In this Issue

Hewlett-Packard's next-generation computers are now under development
in the program code-named Spectrum, and are scheduled to be introduced
in 1986. In our August 1985 issue, Joel Birnbaum and Bill Worley discussed
the philosophy and the aims of the new computers and HP's architecture,
which has been variously described as reduced-complexity, reduced instruction
set computer (RISC), or high-precision. Besides providing higher performance
than existing HP computers, an important objective for the new architecture
is to support efficient high-level language development of systems
and applications software. Compatibility with existing software is another
important objective. The design of high-level language compilers is extremely important to the
new computers, and in fact, the architecture was developed jointly by both hardware and software
engineers. In the article on page 4, three HP compiler designers describe the new compiler
system. At introduction, there will be Fortran, Pascal, COBOL, and C compilers, with others to
become available later. An optional component of the compiler system called the optimizer tailors
the object code to realize the full potential of the architectural features and make programs run
faster on the new machines. As much as possible, the compiler system is designed to remain
unchanged for different operating systems, an invaluable characteristic for application program
development. In the article, the authors debunk several myths about RISCs, showing that RISCs
don't need an architected procedure call, don't cause significant code expansion because of the
simpler instructions, can readily perform integer multiplication, and can indeed support commercial
languages such as COBOL. They also describe millicode, HP's implementation of complex functions
using the simple instructions packaged into subroutines. Millicode acts like microcode in
more traditional designs, but is common to all machines of the family rather than specific to each.

The article on page 20 introduces the HP 7090A Measurement Plotting System and the articles
on pages 24, 27, and 32 expand upon various aspects of its design. The HP 7090A is an X-Y
recorder, a digital plotter, a low-frequency waveform recorder, and a data acquisition system all
in one package. Although all of these instruments have been available separately before, for
some measurement applications where graphics output is desired there are advantages to having
them all together. The analog-to-digital converter and memory of the waveform recorder extend
the bandwidth of the X-Y recorder well beyond the limits of the mechanism (3 kHz instead of a
few hertz). The signal conditioning and A-to-D conversion processes are described in the article
on page 32. The servo design (page 24) is multipurpose: the HP 7090A can take analog inputs
directly or can plot vectors received as digital data. A special measurement graphics software
package (page 27) is designed to help scientists and engineers extend the stand-alone HP
7090A's capabilities without having to write their own software.

No matter how good you think your design is, it will confound some users and cause them to 
circumvent your best efforts to make it friendly. Knowing this, HP's Personal Computer Division 
has been conducting usability tests of new PC designs. Volunteers who resemble the expected 
users are given a series of tasks to perform. The session is videotaped and the product's designers 
are invited to observe. The article on page 36 reports on the sometimes humorous and always 
valuable results. 

-R. P. Dolan

What's Ahead 

The February issue will present the design stories of three new HP instrument offerings. The
cover subject will be the HP 5350A, HP 5351A, and HP 5352A Microwave Frequency Counters,
which use gallium arsenide hybrid technology to measure frequencies up to 40 GHz. Also featured
will be the HP 8757A Scalar Network Analyzer, a transmission and reflection measurement system
for the microwave engineer, and the HP 3457A Multimeter, a seven-function, 3½-to-6½-digit
systems digital voltmeter.

The HP Journal encourages technical discussion of the topics presented in recent articles and will publish letters expected to be of interest to readers. Letters should be brief and are subject
to editing. Letters should be addressed to: Editor, Hewlett-Packard Journal, 3000 Hanover Street, Palo Alto, CA 94304, U.S.A.
Compilers for the New Generation of
Hewlett-Packard Computers 

Compilers are particularly important for the reduced- 
complexity, high-precision architecture of the new 
machines. They make it possible to realize the full potential 
of the new architecture. 

by Deborah S. Coutant, Carol L. Hammond, and Jon W. Kelley 



WITH THE ADVENT of any new architecture, compilers
must be developed to provide high-level
language interfaces to the new machine. Compilers
are particularly important to the reduced-complexity,
high-precision architecture currently being developed at
Hewlett-Packard in the program that has been code-named Spectrum.
The Spectrum program is implementing an architecture that
is similar in philosophy to the class of architectures called
RISCs (reduced instruction set computers).1 The importance
of compilers to the Spectrum program was recognized at
its inception. From the early stages of the new architecture's
development, software design engineers were involved in
its specification.

The design process began with a set of objectives for the
new architecture. These included the following:

■ It must support high-level language development of systems
and applications software.

■ It must be scalable across technologies and implementations.

■ It must provide compatibility with previous systems.

These objectives were addressed with an architectural
design that goes beyond RISC. The new architecture has
the following features:

■ There are many simple instructions, each of which exe- 
cutes in a single cycle. 

■ There are 32 high-speed general-purpose registers. 

■ There are separate data and instruction caches, which
are exposed and can be managed explicitly by the operating
system kernel.

■ The pipeline has been made visible to allow the software
to use cycles normally lost following branch and load
instructions.

■ Performance can be tuned to specific applications by
adding specialized processors that interface with the
central processor at the general-register, cache, or main
memory levels.

The compiling system developed for this high-precision
architecture* enables high-level language programs to use
these features. This paper describes the compiling system
design and shows how it addresses the specific requirements
of the new architecture. First, the impact of high-level
language issues on the early architectural design decisions
is described. Next, the low-level structure of the
compiling system is explained, with particular emphasis
on areas that have received special attention for this
architecture: program analysis, code generation, and optimization.
The paper closes with a discussion of RISC-related
issues and how they have been addressed in this compiling
system.

*The term "high-precision architecture" is used because the instruction set for the new
architecture was chosen on the basis of execution frequency as determined by extensive
measurements across a variety of workloads.

Designing an Architecture for High-Level Languages

The design of the new architecture was undertaken by 
a team made up of design engineers specializing in 
hardware, computer architecture, operating systems, per- 
formance analysis, and compilers. It began with studies of 
computational behavior, leading to an initial design that 
provided efficient execution of frequently used instruc- 
tions, and addressed the trade-offs involved in achieving 
additional functionality. The architectural design was scru- 
tinized by software engineers as it was being developed, 
and their feedback helped to ensure that compilers and 
operating systems would be able to make effective use of 
the proposed features.

A primary objective in specifying the instruction set was
to achieve a uniform execution time for all instructions.
All instructions other than loads and branches were to be
realizable in a single cycle. No instruction would be included
that required a significantly longer cycle or significant
additional hardware complexity. Restricting all instructions
by these constraints simplifies the control of execution.
In conventional microcoded architectures, many instructions
pay an overhead because of the complexity of
control required to execute the microcode. In reduced-complexity
computers, no instruction pays a penalty for a more
complicated operation. Functionality that is not available
in a single-cycle instruction is achieved through multiple-instruction
sequences or, optionally, with an additional
processor.

As the hardware designers began their work on an early
implementation of the new architecture, they were able to
discover which instructions were costly to implement, required
additional complexity not required by other instructions,
or required long execution paths, which would increase
the cycle time of the machine. These instructions
were either removed, if the need for them was not great,
or replaced with simpler instructions that provided the
needed functionality. As the hardware engineers provided
feedback about which instructions were too costly to include,
the software engineers investigated alternate ways
of achieving the same functionality.

4 HEWLETT-PACKARD JOURNAL JANUARY 1986

© Copr. 1949-1998 Hewlett-Packard Co.

For example, a proposed instruction that provided
hardware support for a 2-bit Booth multiplication was not
included because the additional performance it provided
was not justified by its cost. Architecture and compiler
engineers worked together to propose an alternative to this
instruction. Similarly, several instructions that could be
used directly to generate Boolean conditions were deleted
when they were discovered to require a significantly longer
cycle time. The same functionality was available with a
more general two-instruction sequence, enabling all other
operations to be executed faster.
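
The article does not say what alternative was proposed for the multiply instruction. Purely as an illustration of how multiplication can be composed from simple single-cycle operations, the following Python sketch multiplies two unsigned integers using only shifts, adds, and bit tests. The function name and the 32-bit word size are assumptions made for this example, not details of the Spectrum design.

```python
def multiply_shift_add(a: int, b: int, bits: int = 32) -> int:
    """Multiply two unsigned integers using only shift, add, and test.

    Hypothetical illustration: the result is truncated to `bits` bits,
    as it would be in a fixed-width machine register.
    """
    mask = (1 << bits) - 1
    product = 0
    while b:
        if b & 1:                      # low bit of multiplier set?
            product = (product + a) & mask
        a = (a << 1) & mask            # next power-of-two multiple of a
        b >>= 1                        # move to next multiplier bit
    return product
```

Each loop step maps onto a handful of simple register operations, which is the kind of multiple-instruction sequence the text describes.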

The philosophy of reduced-complexity computers includes
the notion that the frequent operations should be
fast, possibly at the expense of less frequent operations.
However, the cost of an infrequent operation should not
be so great as to counterbalance the efficient execution of
the simple operations. Each proposed change to the
architectural specification was analyzed by the entire group
to assess its impact on both software and hardware
implementations. Hardware engineers analyzed the instruction
set to ensure that no single instruction or set of instructions
was causing performance and/or cost penalties for
the entire architecture, and software engineers worked to
ensure that all required functionality would be provided
within performance goals. Compiler writers helped to define
conditions for arithmetic, logical, and extract/deposit
instructions, and to specify where carry/borrow bits would
be used in arithmetic instructions.

As an example of such interaction, compiler writers
helped to tune a conditional branch nullification scheme
to provide for the most efficient execution of the most
common branches. Branches are implemented such that
an instruction immediately following the branch can be
executed before the branch takes effect.1 This allows the
program to avoid losing a cycle if useful work is possible
at that point. For conditional branches, the compiler may
or may not be able to schedule an instruction in this slot
that can be executed in both the taken-branch and
non-taken-branch cases. For these branches, a nullification
scheme was devised which allows an instruction to be
executed only in the case of a taken branch for backward
branches, and only in the case of a non-taken branch for
forward branches. This scheme was chosen to enable all
available cycles to be used in the most common cases.
Backward conditional branches are most often used in a
loop, and such branches will most often be taken, branching
backwards a number of times before falling through at the
end of the iteration. Thus, a nullification scheme that allows
this extra cycle to be used in the taken-branch case
causes this cycle to be used most often. Conversely, for
forward branches, the nullification scheme was tuned to
the non-taken-branch case. Fig. 1 shows the code generated
for a simple code sequence, illustrating the conditional
branch nullification scheme.
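
The nullification rule described above can be modeled in a few lines. The following Python sketch is a simplified model, not the hardware definition: the function names, the use of a signed displacement to distinguish backward from forward branches, and the `nullify` flag are assumptions made for illustration.

```python
def delay_slot_executes(displacement: int, taken: bool, nullify: bool) -> bool:
    """Return True if the instruction following a conditional branch executes.

    Simplified model of the scheme in the text: with nullification requested,
    a backward branch (negative displacement) executes the following
    instruction only when taken, and a forward branch only when not taken.
    """
    if not nullify:
        return True                    # slot instruction always executes
    backward = displacement < 0
    return taken if backward else not taken


def slot_uses(displacement: int, outcomes, nullify: bool = True) -> int:
    """Count how often the slot instruction executes over a run of outcomes."""
    return sum(delay_slot_executes(displacement, t, nullify) for t in outcomes)
```

For a loop whose closing backward branch is taken nine times and then falls through, `slot_uses(-8, [True] * 9 + [False])` counts nine useful executions of the slot, which matches the argument that tuning for the taken case uses the cycle most often.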

Fig. 1. An illustration of the conditional branch nullification
scheme. (a) The conditional branch at the end of a loop will
often be followed by a copy of the first instruction of the loop.
This instruction will only be executed if the branch is taken.
(b) The forward conditional branch implementing an if statement
will often be followed by the first instruction of the then
clause, allowing use of this cycle without rearrangement of
code. This instruction will only be executed if the branch is
not taken.

Very early in the development of the architectural specification,
work was begun on a simulator for the new computer
architecture and a prototype C compiler. Before the
design was frozen, feedback was available about the ease
with which high-level language constructs could be translated
to the new instruction set. The early existence of a
prototype compiler and simulator allowed operating system
designers to begin their development early, and enabled
them to provide better early feedback about their needs,
from the architecture as well as the compiler.

At the same time, work was begun on optimization tech- 
niques for the new architecture. Segments of compiled code 
were hand-analyzed to uncover opportunities for optimiza- 
tion. These hand-optimized programs were used as a
guideline for implementation and to provide a performance
goal. Soon after the first prototype compiler was developed, 
a prototype register allocator and instruction scheduler 
were also implemented, providing valuable data for the 
optimizer and compiler designers. 

Compiling to a Reduced Instruction Set

Compiling for a reduced-complexity computer is simplified
in some aspects. With a limited set of instructions
from which to choose, code generation can be straightforward.
However, optimization is necessary to realize the
full advantage of the architectural features. The new HP
compiling system is designed to allow multiple languages
to be implemented with language-specific compiler front
ends. An optimization phase, common to all of the languages,
provides efficient register use and pipeline scheduling, and
eliminates unnecessary computations. With the elimination
of complex instructions found in many architectures,
the responsibility for generating the proper sequence of
instructions for high-level language constructs falls to the
compiler. Using the primitive instructions, the compiler can
construct precisely the sequence required for the application.

For this class of computer, the software architecture plays
a strong role in the performance of compiled code. There
is no procedure call instruction, so the procedure calling
sequence is tuned to handle simple cases, such as leaf
routines (procedures that do not call any other procedures),
without fixed expense, while still allowing the complexities
of nested and recursive procedures. The saving
of registers at procedure call and procedure entry is depen-

(continued on page 7)
Components of the Optimizer



The optimizer is composed of two types of components, those
that perform data flow and control flow analysis, and those that
perform optimizations. The information provided by the analysis
components is shared by the optimization components and is
used to determine when instructions can be deleted, moved,
rearranged, or modified.

For each procedure, the control flow analysis identifies basic
blocks (sequences of code that have no internal branching).
These are combined into intervals, which form a hierarchy of
control structures. Basic blocks are at the bottom of this hierarchy,
and entire procedures are at the top. Loops and if-then constructs
are examples of the intermediate structures.
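
As a rough illustration of basic block identification, the following Python sketch uses the textbook "leader" method: the first instruction, every branch target, and every instruction that follows a branch start a new block. The instruction encoding (a list of `(opcode, target_index)` pairs) is hypothetical, and the sketch ignores the instruction slot after a branch for simplicity; it is not taken from the HP optimizer.

```python
def basic_blocks(code):
    """Split a non-empty instruction list into basic blocks.

    code: list of (opcode, target) pairs; target is an instruction index
    for branches and None otherwise (an assumed encoding).
    Returns a list of (start, end) index pairs, end exclusive.
    """
    leaders = {0}                          # first instruction starts a block
    for i, (_, target) in enumerate(code):
        if target is not None:             # instruction is a branch
            leaders.add(target)            # branch target starts a block
            if i + 1 < len(code):
                leaders.add(i + 1)         # so does the fall-through successor
    starts = sorted(leaders)
    return list(zip(starts, starts[1:] + [len(code)]))


example = [
    ("ldi", None),    # 0: initialize
    ("add", None),    # 1: loop body
    ("cmpb", 1),      # 2: conditional branch back to 1
    ("stw", None),    # 3: after the loop
]
blocks = basic_blocks(example)
```

Here the loop body (instructions 1 and 2) forms one block, with single-instruction blocks before and after it.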

Data flow information is collected for each interval. It is expressed
in terms of resource numbers and sequence numbers.
Each register, memory location, and intermediate expression has
a unique resource number, and each use or definition of a resource
has a unique sequence number. Three types of data flow
information are calculated:

■ Reaching definitions: for each resource, the set of definitions
that could reach the top of the interval by some path.

■ Exposed uses: for each resource, the set of uses that could
be reached by a definition at the bottom of the interval.

■ Undef set: the set of resources that are not available at the
top of the interval. A resource is available if it is defined along
all paths reaching the interval, and none of its operands are
later redefined along that path.

From this information, a fourth data structure is built:

■ Web: a set of sequence numbers having the property that for
each use in the set, all definitions that might reach it are also
in the set. Likewise, for each definition in the set, all uses it
might reach are also in the set. For each resource there may
be one or many webs.
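
The closure property that defines a web can be computed by merging definitions and uses that reach each other. The following Python sketch does this for a single resource with a small union-find structure. The input format (pairs of definition and use sequence numbers) and the function name are assumptions for illustration; the article does not describe how the HP optimizer builds webs internally.

```python
def build_webs(def_use_pairs):
    """Group sequence numbers of one resource into webs.

    def_use_pairs: iterable of (definition_seq, use_seq) pairs meaning
    "this definition may reach this use" (an assumed input format).
    Returns a sorted list of webs, each a sorted list of sequence numbers.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for d, u in def_use_pairs:
        parent[find(d)] = find(u)           # merge the two groups

    webs = {}
    for x in list(parent):
        webs.setdefault(find(x), set()).add(x)
    return sorted(sorted(w) for w in webs.values())


# Definitions 1 and 2 both reach use 3; definition 4 reaches only use 5.
webs = build_webs([(1, 3), (2, 3), (4, 5)])
```

The example yields two independent webs for the same resource, which is the situation the last bullet describes.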

Loop Optimizations 

Frequently the majority of execution time in a program is spent
executing instructions contained in loops. Consequently, loop- 
based optimizations can potentially improve execution time sig- 
nificantly. The following discussion describes components that 
perform loop optimizations. 

Loop Invariant Code Motion. Computations within a loop that
yield the same result for every iteration are called loop invariant
computations. These computations can potentially be moved
outside the loop, where they are executed less frequently.

An instruction inside the loop is invariant if it meets either of
two conditions: either the reaching definitions for all its operands
are outside the loop, or its operands are defined by instructions
that have already themselves been identified as loop invariant.
In addition, there must not be a conflicting definition of the
instruction's target inside the loop. If the instruction is executed
conditionally inside the loop, it can be moved out only if there are no
exposed uses of the target at the loop exit.

An example is a computation involving variables that are not 
modified in the loop. Another is the computation of an array's 
base address. 
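
The two invariance conditions lend themselves to an iterative marking pass. The Python sketch below is a simplified illustration under stated assumptions: each loop instruction is a hypothetical `(target, operands)` pair, a name that is never a target in the loop is treated as defined outside it, and the extra check for conditionally executed instructions is omitted.

```python
def find_invariants(loop_body):
    """Return indices of loop-invariant instructions.

    loop_body: list of (target, operands) pairs (an assumed encoding).
    An instruction is marked invariant when its target has no other
    definition in the loop and every operand is either defined outside
    the loop or produced by an already-invariant instruction.
    """
    def_count = {}
    for target, _ in loop_body:
        def_count[target] = def_count.get(target, 0) + 1

    invariant, invariant_targets = set(), set()
    changed = True
    while changed:                          # repeat until nothing new is found
        changed = False
        for i, (target, operands) in enumerate(loop_body):
            if i in invariant or def_count[target] != 1:
                continue                    # already marked, or conflicting def
            if all(op not in def_count or op in invariant_targets
                   for op in operands):
                invariant.add(i)
                invariant_targets.add(target)
                changed = True
    return sorted(invariant)


loop = [
    ("t1", ("a", "b")),     # a and b are not modified in the loop
    ("t2", ("t1", "c")),    # depends only on invariant t1 and outside c
    ("i", ("i", "one")),    # loop counter update
    ("t3", ("t2", "i")),    # depends on the counter
]
movable = find_invariants(loop)
```

Only the first two instructions qualify; the counter update and anything depending on it stay in the loop.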

Strength Reduction and Induction Variables. Strength reduction
replaces multiplication operations inside a loop with iterative
addition operations. Since there is no hardware instruction for
integer multiplication in the architecture, converting sequences of
shifts and adds to a single instruction is a performance improvement.
Induction variables are variables that are defined inside
the loop in terms of a simple function of the loop counter.



Once the induction variables have been determined, those
that are appropriate for this optimization are selected. Any
multiplications involved in the computation of these induction
variables are replaced with a copy from a temporary. This temporary
holds the initial value of the function, and is initialized preceding
the loop. It is updated at the point of all the reaching definitions
of the induction variable with an appropriate addition instruction.
Finally, the induction variable itself is eliminated if possible.

This optimization is frequently applied to the computation of
array indices inside a loop, when the index is a function of the
loop counter.
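
The effect of strength reduction on an array-index computation can be shown at source level. In the Python sketch below, the first function multiplies the loop counter by the element size on every iteration, and the second keeps a temporary that is initialized before the loop and updated by addition. The function names and the address arithmetic are illustrative assumptions; the optimizer itself works on machine instructions, not Python.

```python
def addresses_with_multiply(base: int, n: int, size: int):
    """Element addresses computed with a multiplication per iteration."""
    return [base + i * size for i in range(n)]


def addresses_strength_reduced(base: int, n: int, size: int):
    """The same addresses with the multiplication replaced by addition."""
    out = []
    temp = base              # temporary holds the initial value of the function
    for _ in range(n):
        out.append(temp)     # copy from the temporary replaces the multiply
        temp += size         # updated where the loop counter is advanced
    return out
```

Both functions produce identical address sequences, and in the second the loop counter is no longer needed for the address computation.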

Common Subexpression Elimination 

Common subexpression elimination is the removal of redundant
computations and the reuse of the one result. A redundant
computation can be deleted when its target is not in the undef 
set for the basic block it is contained in, and all the reaching 
definitions of the target are the same instruction. Since the op- 
timizer runs at the machine level, redundant loads of the same 
variable in addition to redundant arithmetic computations can 
be removed. 

Store-Copy Optimization 

It is possible to promote certain memory resources to registers
for the scope of their definitions and uses. Only resources that
satisfy aliasing restrictions can be transformed this way. If the
transformation can be performed, stores are converted to copies
and the loads are eliminated. This optimization is very useful for
a machine that has a large number of registers, since it maximizes
the use of registers and minimizes the use of memory.

For each memory resource there may be multiple webs. Each
memory web is an independent candidate for promotion to a
register.

Unused Definition Elimination 

Definitions of memory and register resources that are never 
used are removed. These definitions are identified during the
building of webs.

Local Constant Propagation 

Constant propagation involves the folding and substitution of 
constant computations throughout a basic block. If the result of 
a computation is a constant, the instruction is deleted, and the 
resultant constant is used as an immediate operand in sub- 
sequent instructions that reference the original result. Also, if the 
operands of a conditional branch are constant, the branch can 
be changed to an unconditional branch or deleted. 
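
A minimal sketch of folding and substitution within one basic block follows. It assumes a hypothetical three-address encoding `(target, op, a, b)` in which integer operands are immediates and string operands are register names, and it supports only three operations. Branch simplification is not shown, and the returned table of known constants is there so a caller could still materialize a value that is needed after the block.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}


def propagate_constants(block):
    """Fold constant computations in one basic block.

    block: list of (target, op, a, b) tuples (an assumed encoding).
    Returns (remaining_instructions, known_constants).
    """
    known = {}                              # register name -> constant value
    out = []
    for target, op, a, b in block:
        a = known.get(a, a)                 # substitute known constants
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[target] = OPS[op](a, b)   # result is constant: drop instruction
        else:
            known.pop(target, None)         # target is no longer a known constant
            out.append((target, op, a, b))
    return out, known


code, constants = propagate_constants([
    ("r1", "add", 2, 3),        # folds to the constant 5
    ("r2", "add", "r1", "r9"),  # r1 becomes an immediate operand
    ("r3", "sub", "r1", 1),     # folds to the constant 4
])
```

After the pass only the instruction involving the unknown register `r9` remains, with the folded constant used as its immediate operand.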

Coloring Register Allocation 

Many components introduce additional uses of registers or
prolong the use of existing registers over larger portions of the
procedure. Near-optimal use of the available registers becomes
crucial after these optimizations have been made.

Global register allocation based on a method of graph coloring
is performed. The register resources are partitioned into groups
of disjoint definitions and uses called register webs. Then, using
the exposed uses information, interferences between webs are
computed. An interference occurs when two webs must be assigned
different machine registers. Registers that are copies of
each other are assigned to the same register and the copies are
eliminated. The webs are sorted based on the number of interferences
each contains. Then register assignment is done
by coloring. When the register allocator runs out of registers,
it frees a register by saving another one to memory temporarily.
A heuristic algorithm is used to choose which register to save.
For example, registers used heavily within a loop will not be
saved to free a register.
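
The following Python sketch shows a simple greedy form of coloring over an interference graph. Several choices here are assumptions rather than details from the article: webs are visited from most to fewest interferences, each web takes the lowest-numbered free register, and a web that cannot be colored is simply reported as spilled instead of applying the save heuristic described above.

```python
def color_webs(interference, num_registers):
    """Assign machine registers to webs by greedy graph coloring.

    interference: dict mapping each web to the set of webs it interferes
    with (assumed symmetric). Returns (assignment, spilled).
    """
    # Assumed ordering: most-constrained webs first.
    order = sorted(interference, key=lambda w: len(interference[w]), reverse=True)
    assignment, spilled = {}, []
    for web in order:
        used = {assignment[n] for n in interference[web] if n in assignment}
        free = [r for r in range(num_registers) if r not in used]
        if free:
            assignment[web] = free[0]       # lowest-numbered free register
        else:
            spilled.append(web)             # no register left for this web
    return assignment, spilled


graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
regs, spills = color_webs(graph, 3)
```

With three registers every web in the example is colored; with two, one web of the mutually interfering group has to be spilled.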

Peephole Optimizations 

The peephole optimizer uses a dictionary of instruction
patterns to simplify instruction sequences. Some of the patterns
identify simplifications to addressing mode changes, bit
manipulations, and data type conversions.

Branch Optimizations 

The branch optimizer component traverses the instructions,
transforming branch instruction sequences into more efficient
instruction sequences. It converts branches over single instructions
to instructions with conditional nullification. A branch whose
target is the next instruction is deleted. Branch chains involving
both unconditional and conditional branches are combined into
shorter sequences wherever possible. For example, a conditional
branch to an unconditional branch is changed to a single conditional
branch.
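
The branch-chain case can be sketched as a retargeting pass. The encoding below is hypothetical: `"b"` is an unconditional branch, `"cb"` a conditional branch, targets are instruction indices, and non-branch instructions carry a target of `None`. Deleting branches to the next instruction and converting short branches to nullified instructions are not shown.

```python
def collapse_branch_chains(code):
    """Retarget branches that lead to unconditional branches.

    code: list of (opcode, target) pairs (an assumed encoding).
    Returns a new list in which every branch points at its final target.
    """
    def final_target(t):
        seen = set()
        while code[t][0] == "b" and t not in seen:   # follow the chain
            seen.add(t)                              # guard against cycles
            t = code[t][1]
        return t

    return [(op, final_target(t)) if t is not None else (op, t)
            for op, t in code]


before = [
    ("cb", 2),      # 0: conditional branch to an unconditional branch
    ("add", None),  # 1
    ("b", 4),       # 2: unconditional branch
    ("sub", None),  # 3
    ("stw", None),  # 4
]
after = collapse_branch_chains(before)
```

The conditional branch at index 0 now goes directly to index 4, so the taken path executes one branch instead of two.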

Dead Code Elimination 

Dead code is code that cannot be reached at program execution,
since no branch to it or fall-through exists. This code is
deleted.

Scheduler 

The instruction scheduler reorders the instructions within a
basic block, minimizing load/store and floating-point interlocks.
It also schedules the instructions following branches.

Suneel Jain

Development Engineer 

Information Technology Group 
dent on the register use of the individual procedure. A
special calling convention has been adopted to allow some
complex operations to be implemented in low-level
routines known as millicode, which incur little overhead
for saving registers and status.

Compiling to a reduced instruction set can be simplified
because the compiler need not make complicated choices
among a number of instructions that have similar effects.
In the new architecture, all arithmetic, logical, or conditional
instructions are register-based. All memory access
is done through explicit loads and stores. Thus the compiler
need not choose among instructions with a multitude of
addressing modes. The compiler's task is further simplified
by the fact that the instruction set has been constructed in
a very symmetrical manner. All instructions are the same
length, and there are a limited number of instruction formats.
In addition to simplifying the task of code generation,
this makes the task of optimization easier as well. The
optimizer need not handle transformations between instructions
that have widely varying formats and addressing
modes. The symmetry of the instruction set makes the tasks
of replacing or deleting one or more instructions much
easier.

Of course, the reduced instruction set computer, though
simplifying some aspects of the compilation, requires more
of the compilers in other areas. Having a large number of
registers places the burden on the compilers to generate
code that can use these registers efficiently. Other aspects
of this new architecture also require the compilers to be
more intelligent about code generation. For example, the
instruction pipeline has become more exposed and, as mentioned
earlier, the instruction following a branch may be
executed before the branch takes effect. The compiler therefore
needs to schedule such instructions effectively. In addition,
loads from memory, which also require more than
a single cycle, will interlock with the following instruction
if the target register is used immediately. The compiler can
increase execution speed by scheduling instructions to
avoid these interlocks. The optimizer can also improve the
effectiveness of a floating-point coprocessor by eliminating
unnecessary coprocessor memory accesses and by reordering
the floating-point instructions.
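
To make the load interlock concrete, the following Python sketch counts how many times a load is immediately followed by an instruction that reads the loaded register. The `(opcode, target, sources)` encoding and the one-instruction interlock window are simplifying assumptions for illustration, not a description of the actual pipeline timing.

```python
def count_load_interlocks(code):
    """Count loads whose target is read by the very next instruction.

    code: list of (opcode, target, sources) tuples (an assumed encoding).
    """
    stalls = 0
    for prev, cur in zip(code, code[1:]):
        if prev[0] == "load" and prev[1] in cur[2]:
            stalls += 1
    return stalls


unscheduled = [
    ("load", "r1", ()),
    ("add", "r3", ("r1", "r2")),    # uses r1 right after its load
    ("load", "r4", ()),
    ("add", "r5", ("r4", "r3")),    # uses r4 right after its load
]
scheduled = [
    ("load", "r1", ()),
    ("load", "r4", ()),             # second load fills the slot after the first
    ("add", "r3", ("r1", "r2")),
    ("add", "r5", ("r4", "r3")),
]
```

Both sequences compute the same values, but under this model the reordered one has no load immediately followed by a use of its target.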

In addition to such optimizations, which are designed 
to exploit specific architectural features, conventional op- 
timizations such as common subexpression elimination, 
loop invariant code motion, induction variable elaboration, 
and local constant propagation were also implemented.* 
These have a major impact on the performance of any com- 
puter. Such optimizations reduce the frequency of loads, 
stores, and multiplies, and allow the processor to be used
with greater efficiency. However, the favorable cost/performance
of the new HP architecture can be realized even without
optimization.

The Compiler System 

All of the compilers for the new architecture share a
common overall design structure. This allows easy integration
of common functional components including a symbolic
debugger, a code generator, an optimizer, and a linker.
This integration was achieved through detailed planning,
which involved the participation of engineers across many
language products. Of the new compilers, the Fortran 77,
Pascal, and COBOL compilers will appear very familiar to
some of our customers, since they were developed from
existing products available on the HP 3000 family of computers.
All of these compilers conform to HP standard
specifications for their respective languages, and thus will
provide smooth migration from the HP 1000, HP 3000, and
HP 9000 product lines. The C compiler is a new product,
and as mentioned earlier, was the compiler used to prototype
the instruction set from its earliest design phase.
The C compiler conforms to recognized industry standard
language specifications. Other compilers under development
will be integrated into this compiler system.

To achieve successful integration of compilers into a
homogeneous compiling system it was necessary to define
distinct processing phases and their exact interfaces in
terms of data and control transfer. Each compiler begins
execution through the front end. This includes the lexical,
syntactic, and semantic analysis prescribed by each language
standard. The front ends generate intermediate codes
from the source program, and pass these codes to the code
generators. The intermediate codes are at a higher level
than the machine code generated by a later phase, and
allow a certain degree of machine abstraction within the
front ends.

Two distinct code generators are used. They provide
varying degrees of independence from the front ends. Each
interfaces to the front ends through an intermediate code.
One of these code generation techniques has already been
used in two compiler products for the HP 3000. Fig. 2
shows the overall design of the compilers. Each phase of
the compilation process is pictured as it relates to the other
phases. The front ends are also responsible for generating
data to be used later in the compilation process. For example,
the front end generates data concerning source statements
and the types, scopes, and locations of procedure/function
and variable names for later use by the symbolic
debugger. In addition, the front end is responsible for the
collection of data to be used by the optimizer.

These compilers can be supported by multiple operating
systems. The object file format is compatible across operat- 
ing systems. 

Code Generation 

The code generators emit machine code into a data structure
called SLLIC (Spectrum low-level intermediate code).
SLLIC also contains information regarding branches and
their targets, and thus provides the foundation for the building
of a control flow graph by the optimizer. The SLLIC
data structure contains the machine instructions and the
specifications for the run-time environment, including the
program data space, the literal pool, and data initialization.
SLLIC also holds the symbolic debug information generated
by the front end, is the medium for later optimization, and
is used to create the object file.

Fig. 2. The compiler system for HP's new generation of
high-precision-architecture computers.

The reduced instruction set places some extra burden 
on the code generators when emitting code for high-level 
language constructs such as byte moves, decimal opera- 
tions, and procedure calls. Since the instruction set con- 
tains no complex instructions to aid in the implementation 
of these constructs, the code generators are forced to use
combinations of the simpler instructions to achieve the 
same functionality. However, even in complex instruction 
set architectures, complex case analysis is usually required 
to use the complex instructions correctly. Since there is
little redundancy in the reduced instruction set, most often 
no choice of alternative instruction sequences exists. The 
optimizer is the best place for these code sequences to be 
streamlined, and because of this the overall compiler de- 
sign is driven by optimization considerations. In particular, 
the optimizer places restrictions upon the code generators. 

The first class of such restrictions involves the presentation
of branch instructions. The optimizer requires that all
branches initially be followed by a NOP (no operation)
instruction. This restriction allows the optimizer to schedule
instructions easily to minimize interlocks caused by data
and register access. These NOPs are subsequently replaced
with useful instructions, or eliminated.

The second class of restrictions concerns register use.
Register allocation is performed within the optimizer.
Rather than use the actual machine registers, the code
generators use symbolic registers chosen from an infinite
register set. These symbolic registers are mapped to the set
of actual machine registers by the register allocator.
Although register allocation is the traditional name for such
an activity, register assignment is more accurate in this
context. The code generators are also required to associate
every syntactically equivalent expression in each procedure
with a unique symbolic register number. The symbolic
register number is used by the optimizer to associate each
expression with a value number (each run-time value has
a unique number). Value numbering the symbolic registers
aids in the detection of common subexpressions within
the optimizer. For example, every time the local variable
i is loaded it is loaded into the same symbolic register, and
every time the same two symbolic registers are added
together the result is placed into a symbolic register dedicated
to hold that value.
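
The symbolic register convention can be pictured with a small
Python sketch. It is an illustration only: the class and method
names are hypothetical and are not taken from the HP code
generators. It assumes that a table keyed on the syntactic form
of an expression is enough to hand back the same register each
time.

```python
# Hypothetical sketch: SymbolicRegisters and reg_for are illustrative
# names, not part of the HP compilers.
class SymbolicRegisters:
    """Hands out one symbolic register per syntactically distinct expression."""

    def __init__(self):
        self._table = {}   # expression key -> symbolic register number
        self._next = 1     # the supply of symbolic registers is unbounded

    def reg_for(self, *expr):
        """Return the register dedicated to expr, allocating it on first use."""
        if expr not in self._table:
            self._table[expr] = self._next
            self._next += 1
        return self._table[expr]


regs = SymbolicRegisters()
r_i = regs.reg_for("load", "i")         # local variable i
r_j = regs.reg_for("load", "j")         # local variable j
r_sum = regs.reg_for("add", r_i, r_j)   # i + j

# Loading i again, or adding the same two registers again, yields the
# register chosen the first time.
print(regs.reg_for("load", "i") == r_i)
print(regs.reg_for("add", r_i, r_j) == r_sum)
```

Because equal expressions always land in the same symbolic
register, a later phase can spot a repeated computation by
comparing register numbers rather than instruction sequences.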

Although the optimizer performs transformations at the
machine instruction level, there are occasions where it
could benefit from the existence of slightly modified and/or
additional instructions. Pseudoinstructions are instructions
that map to one or more machine instructions and
are only valid within the SLLIC data structure as a software
convention recognized between the code generators and
the optimizer. For example, the NOP instruction mentioned
above is actually a pseudoinstruction. No such instruction
exists on the machine, although there are many
instruction/operand combinations whose net effect would be
null. The NOP pseudoinstruction saves the optimizer from
having to recognize all those sequences. Another group of
pseudoinstructions has been defined to allow the optimizer to
view all the actual machine instructions in the same
canonical form, without being restricted by the register use
prescribed by the instructions. For example, some instructions
use the same register as both a source and a target. This
makes optimization very difficult for that instruction. The
solution involves the definition of a set of
pseudoinstructions, each of which maps to a two-instruction
sequence: first to copy the source register to a new symbolic
register, and then to perform the operation on that new
register. The copy instruction will usually be eliminated
by a later phase of the optimizer.

Another class of perhaps more important pseudoinstructions
involves the encapsulation of common operations
that are traditionally supported directly by hardware, but
in a reduced instruction set are only supported through
the generation of code sequences. Examples include
multiplication, division, and remainder. Rather than have each
code generator contain the logic to emit some correct
sequence of instructions to perform multiplication, a set of
pseudoinstructions has been defined that makes it appear
as if a high-level multiplication instruction exists in the
architecture. Each of the pseudoinstructions is defined in
terms of one register target and either two register operands
or one register operand and one immediate. The use of
these pseudoinstructions also aids the optimizer in the
detection of common subexpressions, loop invariants, and
induction variables by reducing the complexity of the code
sequences the optimizer must recognize.
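
To make the idea concrete, the following Python sketch expands
a multiply-by-immediate pseudoinstruction into a sequence of
simple shift-and-add steps. This is an assumption-laden
illustration, not the expansion used by the HP compilers: the
opcode names ZERO and SHIFT_ADD, the function names, and the
restriction to non-negative constants are all hypothetical.

```python
def expand_multiply(target, source, constant):
    """Expand target = source * constant into simple steps.

    Hypothetical opcodes: ("ZERO", t) clears t, and
    ("SHIFT_ADD", t, s, k) performs t = t + (s << k).
    The target register must differ from the source register.
    """
    if constant < 0:
        raise ValueError("sketch handles non-negative constants only")
    code = [("ZERO", target)]
    shift = 0
    while constant:
        if constant & 1:            # one step per set bit of the constant
            code.append(("SHIFT_ADD", target, source, shift))
        constant >>= 1
        shift += 1
    return code


def run(code, regs):
    """Tiny interpreter for the two hypothetical opcodes."""
    for op, *args in code:
        if op == "ZERO":
            regs[args[0]] = 0
        elif op == "SHIFT_ADD":
            t, s, k = args
            regs[t] = regs[t] + (regs[s] << k)
    return regs


sequence = expand_multiply("t", "s", 10)   # 10 = binary 1010
print(sequence)
print(run(sequence, {"s": 7})["t"])
```

Until such an expansion takes place, the optimizer sees only a
single multiply with one target and two operands, which is far
easier to match than the expanded sequence.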

Control flow restrictions are also placed on generated
code. A basic block is defined as a straight-line sequence
of code that contains no transfer of control out of or into
its midst. If the code generator wishes to set the carry/borrow
bit in the status register, it must use that result within
the same basic block. Otherwise, the optimizer cannot
guarantee its validity. Also, all argument registers for a
procedure call must be loaded in the same basic
block that contains the procedure call. This restriction
helps the register allocator by limiting the instances where
hard-coded (actual) machine registers can be live (active)
across basic block boundaries.

Optimization 

After the SLLIC data structure has been generated by the
code generator, a call is made to the optimizer so that it
can begin its processing. The optimizer performs
intraprocedural local and global optimizations, and can be turned
on and off on a procedure-by-procedure basis by the
programmer through the use of compiler options and directives
specific to each compiler. Three levels of optimization are
supported and can also be selected at the procedural level.

Optimization is implemented at the machine instruction
level for two reasons. First, since the throughput of the
processor is most affected by the requests made of the
memory unit and cache, optimizations that reduce the number
of requests made, and optimizations that rearrange these
requests to suit the memory unit best, are of the most value.
It is only at the machine level that all memory accesses
become exposed, and are available candidates for such
optimizations. Second, the machine level is the common
denominator for all the compilers, and will continue to be
for future compilers for the architecture. This allows the
implementation of one optimizer for the entire family of
compilers. In addition to very machine specific
optimizations, a number of theoretically machine independent
optimizations (for example, loop optimizations) are also
included. These also benefit from their low-level
implementation, since all potential candidates are exposed. For
example, performing loop optimizations at the machine level
allows the optimizer to move constants outside the loop,
since the machine has many registers to hold them. In
summary, no optimization has been adversely affected by this
strategy; instead, there have been only benefits.

Level 0 optimization is intended to be used during program
development. It is difficult to support symbolic debugging
in the presence of all optimizations, since many
optimizations reorder or delete instruction sequences.
Nonsymbolic debugging is available for fully optimized
programs, but users will still find it easier to debug
nonoptimized code since the relationship between the source and
object code is clearer. No code transformations are made
at level 0 that would preclude the use of a symbolic
debugger. In particular, level 0 optimizations include some copy
and NOP elimination, and limited branch scheduling. In
addition, the components that physically exist as part of
the optimizer, but are required to produce an executable
program, are invoked. These include register allocation and
branch fixing (replacing short branches with long branches
where necessary).

After program correctness has been demonstrated using
only level 0 optimizations, the programmer can use the
more extensive optimization levels. There are two
additional levels of optimization, either of which results in
code reordering. The level any particular optimization
component falls into is dependent upon the type of
information it requires to perform correct program
transformations. The calculation of data flow information gives the
optimizer information regarding all the resources in the
program. These resources include general registers,
dedicated and status registers, and memory locations
(variables). The information gleaned includes where each
resource is defined and used within the procedure, and is
critical for some optimization algorithms. Level 1
optimizations require no data flow information, therefore adding
only a few additional optimizations over level 0. Invoking
the optimizer at level 2 will cause all optimizations to be
performed. This requires data flow information to be
calculated.

Level 1 optimization introduces three new optimizations:
peephole and branch optimizations and full instruction
scheduling. Peephole optimizations are performed by
pattern matching short instruction sequences in the code
to corresponding templates in the peephole optimizer. An
example of a transformation is seen in the C source
expression

    if (flag & 0x8)

which tests to see that the fourth bit from the right is set
in the integer flag. The unoptimized code is

    LDO       8(0),19        load immediate 8 into r19
    AND       31,19,20       intersect r31 (flag) with r19 into r20
    COMIBT,=  0,20,label     compare result against 0 and branch

Peephole optimization replaces these three instructions
with the one instruction

    BB,>=     31,28,label    branch on bit

which will branch if bit 28 (numbered left to right from 0)
in r31 (the register containing flag) is equal to 0.
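
A template match of this kind can be sketched in a few lines of
Python. The sketch is hypothetical: the tuple layout of the
instructions, the function name, and the `BB,>=` spelling of the
branch-on-bit mnemonic are assumptions made for illustration,
and the check that the two temporary registers are dead after
the branch is omitted.

```python
def peephole_bit_test(code):
    """Replace LDO imm / AND / COMIBT,= 0 by one branch-on-bit.

    Assumed tuple layouts (illustrative only):
      ("LDO", immediate, target)
      ("AND", source1, source2, target)
      ("COMIBT,=", immediate, register, label)
      ("BB,>=", register, bit, label)
    A real optimizer would also verify that both temporaries are dead.
    """
    out, i = [], 0
    while i < len(code):
        if i + 2 < len(code):
            ldo, and_, br = code[i], code[i + 1], code[i + 2]
            if (ldo[0] == "LDO" and and_[0] == "AND"
                    and br[0] == "COMIBT,="
                    and ldo[1] > 0 and (ldo[1] & (ldo[1] - 1)) == 0
                    and ldo[2] in and_[1:3]
                    and br[1] == 0 and br[2] == and_[3]):
                # The operand of AND that is not the mask is the tested one.
                tested = and_[2] if and_[1] == ldo[2] else and_[1]
                # Bits are numbered left to right from 0 in a 32-bit word.
                bit = 31 - (ldo[1].bit_length() - 1)
                out.append(("BB,>=", tested, bit, br[3]))
                i += 3
                continue
        out.append(code[i])
        i += 1
    return out


before = [("LDO", 8, 19), ("AND", 31, 19, 20), ("COMIBT,=", 0, 20, "label")]
print(peephole_bit_test(before))
```

The match fires only when the immediate is a single-bit mask;
any other sequence is copied through unchanged.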

Level 1 optimization also includes a branch optimizer 
whose task is to eliminate unnecessary branches and some 
unreachable code. Among other tasks, it replaces branch 
chains with a single branch, and changes conditional 
branches whose targets are unconditional branches to a 
single conditional branch. 
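
Collapsing a branch chain amounts to following each target
until it no longer lands on an unconditional branch. The Python
sketch below shows the idea under simplifying assumptions: the
names are hypothetical, each branch is reduced to an
(opcode, target) pair, and `jumps` records only those labels
whose code is nothing but an unconditional branch.

```python
def final_target(label, jumps):
    """Follow a chain of unconditional branches to its last label."""
    seen = set()
    while label in jumps and label not in seen:   # guard against cycles
        seen.add(label)
        label = jumps[label]
    return label


def retarget(branches, jumps):
    """Point every branch directly at the end of its chain."""
    return [(op, final_target(target, jumps)) for op, target in branches]


# L1 only branches to L2, and L2 only branches to L3.
jumps = {"L1": "L2", "L2": "L3"}
branches = [("B", "L1"), ("COMBT", "L2"), ("B", "L9")]
print(retarget(branches, jumps))
```

Once no branch refers to L1 or L2 any longer, the intermediate
branches become unreachable and can be removed.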

The limited instruction scheduling algorithm of level 0
is replaced with a much more thorough component in level
1. Level 0 scheduling is restricted to replacing or removing
the NOPs following branches where possible, since code
sequence ordering must be preserved for the symbolic
debugger. In addition to this, level 1 instructions are
scheduled with the goal of minimizing memory interlocks. The
following typify the types of transformations made:

■ Separate a load from the instruction that uses the loaded
register

■ Separate store and load instruction sequences

■ Separate floating-point instructions from each other to
improve throughput of the floating-point unit.

Instruction scheduling is accomplished by first constructing
a dependency graph that details data dependencies
between instructions. Targeted instructions are separated
by data independent instructions discovered in the
graph.
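
The following Python sketch illustrates the first of these
transformations. It is a simplified greedy scheduler written
for this discussion, not the algorithm used in the HP optimizer.
Each instruction is assumed to be a tuple of a name, the set of
resources it defines, and the set it uses, and only the "load
followed immediately by its use" interlock is modeled.

```python
def build_dependencies(code):
    """Map each instruction index to the earlier indices it depends on."""
    deps = {j: set() for j in range(len(code))}
    for j, (_, defs_j, uses_j) in enumerate(code):
        for i in range(j):
            _, defs_i, uses_i = code[i]
            # read-after-write, write-after-read, or write-after-write
            if defs_i & uses_j or uses_i & defs_j or defs_i & defs_j:
                deps[j].add(i)
    return deps


def schedule(code):
    """Greedy ordering that avoids using a register right after its load."""
    deps = build_dependencies(code)
    done, order = set(), []
    while len(order) < len(code):
        ready = [j for j in range(len(code))
                 if j not in done and deps[j] <= done]
        prev = code[order[-1]] if order else None

        def stalls(j):
            # True if j would use the result of the load just scheduled.
            return (prev is not None and prev[0].startswith("LD")
                    and bool(prev[1] & code[j][2]))

        pick = min(ready, key=lambda j: (stalls(j), j))
        done.add(pick)
        order.append(pick)
    return [code[j] for j in order]


code = [
    ("LDW a", {"r1"}, {"a"}),
    ("ADD r1", {"r2"}, {"r1"}),   # uses r1 immediately after its load
    ("LDW b", {"r3"}, {"b"}),
    ("ADD r3", {"r4"}, {"r3"}),
]
print([name for name, _, _ in schedule(code)])
```

The second load is independent of the first add, so the
scheduler can move it between the first load and its use.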

The same register allocator is used in level 0 and level
1 optimization. It makes one backwards pass over each
procedure to determine where the registers are defined and
used and whether or not they are live across a call. It uses
this information as a basis for replacing the symbolic
registers with actual machine registers. Some copy elimination
is also performed by this allocator.
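
A single backward pass of this kind might look like the Python
sketch below. It is an illustration under stated assumptions:
the code is straight-line (no branches), each instruction is a
tuple of an opcode, the registers it defines, and the registers
it uses, and the function name is hypothetical.

```python
def live_across_calls(code):
    """Return the symbolic registers that are live across some CALL.

    Walks the instructions from last to first, maintaining the set of
    registers whose current value is still needed later on.
    """
    live = set()
    across = set()
    for op, defs, uses in reversed(code):
        live -= set(defs)          # a definition ends the live range
        if op == "CALL":
            across |= live         # needed after the call, not set by it
        live |= set(uses)          # a use starts (extends) the live range
    return across


code = [
    ("LDW", ["s1"], []),
    ("LDW", ["s2"], []),
    ("CALL", ["s3"], ["s2"]),      # s2 is an argument, s3 the result
    ("ADD", ["s4"], ["s1", "s3"]),
]
print(live_across_calls(code))
```

Here only s1 holds a value that must survive the call; s2 is
consumed by the call and s3 is produced by it.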

Level 2 optimizations include all level 1 optimizations
as well as local constant propagation, local peephole
transformations, local redundant definition elimination,
common subexpression and redundant load/store elimination,
loop invariant code motion, induction variable elaboration
and strength reduction, and another register allocator. The
register allocator used in level 2 is partially based on graph
coloring technology. Fully optimized code contains many
more live registers than partially optimized or
nonoptimized code. This register allocator handles many live
registers better than the register allocator of levels 0 and
1. It has access to the data flow information calculated for
the symbolic registers and information regarding the
frequency of execution for each basic block.

Control Flow and Data Flow Analysis 

All of the optimizations introduced in level 2 require
data flow information. In addition, a certain amount of
control flow information is required to do loop-based
optimizations. Data flow analysis provides information to the
optimizer about the pattern of definition and use of each
resource. For each basic block in the program, data flow
information indicates what definitions may reach the block
(reaching definitions) and what later uses may be affected
by local definitions (exposed uses). Control flow
information in the optimizer is contained in the basic block and
interval structures. Basic block analysis identifies blocks
of code that have no internal branching. Interval analysis
identifies patterns of control flow such as if-then-else and
loop constructs.5 Intervals simplify data flow calculations,
identify loops for the loop-based optimizations, and enable
partial update of data flow information.

In the optimizer, control flow analysis and data flow 
analysis are performed in concert. First, basic blocks are 
identified. Second, local data flow information is calcu- 
lated for each basic block. Third, interval analysis exposes 
the structure of the program. Finally, using the interval 
structure as a basis for its calculation rules, global data 
flow analysis calculates the reaching definitions and ex- 
posed uses. 

Basic block analysis of the SLLIC data structure results
in a graph structure where each basic block identifies a
sequence of instructions, along with the predecessor and
successor basic blocks. The interval structure is built on
top of this, with the smallest interval being a basic block.
Intervals other than basic blocks contain subintervals
which may themselves be any type of interval. Interval
types include basic block, sequential block (the subintervals
follow each other in sequential order), if-then, if-then-else,
self loop, while loop, repeat loop, and switch (case
statement). When no such interval is recognized, a set of
subintervals may be contained in either a proper interval
(if the control flow is well-behaved) or an improper interval
(if it contains multiple-entry cycles or targets of unknown
branches). An entire procedure will be represented by a
single interval with multiple descendants. Fig. 3 shows the
interval structure for a simple Pascal program.

[Figure: nested boxes. A Sequential Block contains a Basic
Block (i := 0;) followed by a Repeat Loop Interval (repeat
begin ... end until i = 10). The loop contains an If-Then
Interval, made up of one Basic Block that tests a[i] and one
Basic Block that assigns to a[i], followed by a Basic Block
(i := i + 1;).]

Fig. 3. This figure illustrates the interval structure of a simple
sequence of Pascal code. The nested boxes represent the
interval hierarchy.
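
Basic block analysis, the first step in building this structure,
can be sketched as the familiar search for block leaders. The
Python below is an illustration rather than HP's implementation.
It assumes each instruction is a tuple of an optional label, a
description, and an optional branch target, and the sample
program only loosely follows the shape of the loop in Fig. 3.

```python
def basic_blocks(code):
    """Split code into basic blocks, returned as lists of indices.

    A block starts at the first instruction, at every branch target,
    and at every instruction that follows a branch.
    """
    if not code:
        return []
    labels = {lab: n for n, (lab, _, _) in enumerate(code) if lab}
    leaders = {0}
    for n, (_, _, target) in enumerate(code):
        if target is not None:               # this instruction branches
            leaders.add(labels[target])
            if n + 1 < len(code):
                leaders.add(n + 1)
    starts = sorted(leaders)
    ends = starts[1:] + [len(code)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]


code = [
    (None, "i := 0", None),
    ("LOOP", "test a[i], branch around the assignment", "SKIP"),
    (None, "assign to a[i]", None),
    ("SKIP", "i := i + 1", None),
    (None, "branch back unless i = 10", "LOOP"),
]
print(basic_blocks(code))
```

The four blocks found here correspond to the four basic blocks
that interval analysis would then group into if-then, repeat
loop, and sequential intervals.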

The calculation of data flow information begins with an
analysis of what resources are used and defined by each
basic block. Each use or definition of a resource is identified
by a unique sequence number. Associated with each
sequence number is information regarding what resource is
being referenced, and whether it is a use or a definition.

Each SLLIC instruction entry contains sequence numbers
for all of the resources defined or used by that instruction.
The local data flow analysis determines what local uses
are exposed at the top of the basic block (i.e., there is a use
of a resource with no preceding definition in that block)
and what local definitions will reach the end of the block
(i.e., they define a resource that is not redefined later in
the block). The local data flow analysis makes a forward
and backward pass through the instructions in a basic block
to determine this information.
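
The two passes can be illustrated with a short Python sketch.
This is a simplified model, not the optimizer's code: each
instruction is assumed to be a pair of lists (resources defined,
resources used), and the position of the instruction in the
block stands in for the sequence numbers described above.

```python
def local_data_flow(block):
    """Return (exposed_uses, reaching_defs) for one basic block.

    Both results are lists of (instruction index, resource) pairs.
    """
    defined = set()
    exposed_uses = []
    for n, (defs, uses) in enumerate(block):          # forward pass
        # A use is exposed if nothing earlier in the block defines it.
        exposed_uses += [(n, r) for r in uses if r not in defined]
        defined.update(defs)

    redefined = set()
    reaching_defs = []
    for n in range(len(block) - 1, -1, -1):           # backward pass
        defs, _ = block[n]
        # A definition reaches the end if nothing later redefines it.
        reaching_defs += [(n, r) for r in defs if r not in redefined]
        redefined.update(defs)
    return exposed_uses, sorted(reaching_defs)


block = [
    (["t"], ["j", "k"]),   # t := j + k
    (["j"], ["t"]),        # j := t
    (["t"], ["j", "i"]),   # t := j + i
]
exposed, reaching = local_data_flow(block)
print(exposed)
print(reaching)
```

In this block the uses of j and k in the first instruction and
of i in the last are exposed, while only the final definitions
of j and t reach the end of the block.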

Local data flow information is propagated out from the
basic blocks to the outermost interval. Then, information
about reaching definitions and exposed uses is propagated
inward to the basic block level. For known interval types,
this involves a straightforward calculation for each
subinterval. For proper intervals, this calculation must be
performed twice for each subinterval, and for improper
intervals, the number of passes is limited by the number of
subintervals.

As each component of the optimizer makes transformations
to the SLLIC graph, the data flow information becomes
inaccurate. Two strategies are employed to bring this
information up-to-date: patching of the existing data flow
information and partial recalculation. For all optimizations
except induction variable elimination, the data flow
information can be patched by using information about the nature
of the transformation to determine exactly how the data
flow information must be changed. All transformations take
place within the loop interval in induction variable
elimination. The update of data flow information within the
loop is performed by recalculating the local data flow
information where a change has been made, and then by
propagating that change out to the loop interval. The effect of
induction variable elimination on intervals external to the
loop is limited, and this update is performed by patching
the data flow information for these intervals.

Aliasing 

The concept of resources has already been presented in
the earlier discussion of data flow analysis. The optimizer
provides a component called the resource manager for use
throughout the compiler phases. The resource manager is
responsible for the maintenance of information regarding
the numbers and types of resources within each procedure.
For example, when the code generator needs a new
symbolic register, it asks the resource manager for one. The
front ends also allocate resources corresponding to memory
locations for every variable in each procedure. The
resources allocated by the resource manager are called
resource numbers. The role of the resource manager is
especially important in this family of compilers. It provides a
way for the front end, which deals with memory resources
in terms of programmer variable names, and the optimizer,
which deals with memory resources in terms of actual
memory locations, to communicate the relationship
between the two.

The most basic use of the resource numbers obtained 
through the resource manager is the identification of 
unique programmer variables. The SLLIC instructions are 
decorated with information that associates resource num- 
bers with each operand. This allows the optimizer to rec- 
ognize uses of the same variable without having to compare 
addresses. The necessity for communication between the 
front ends and the optimizer is demonstrated by the follow- 
ing simplified example of C source code: 



    proc() {
        int i, j, k, *p;
        ...
        i = j + k;
        *p = 1;
        i = j + k;
        ...
    }

At first glance it might seem that the second calculation
of j + k is redundant, and in fact it is a common
subexpression that need only be calculated once. However, if the
pointer p has been set previously to point to either j or k,
then the statement *p = 1 might change the value of either
j or k. If p has been assigned to point to j, then we say that
*p and j are aliased to each other. Every front end includes
a component called a gatherer6 whose responsibility it is
to collect information concerning the ways in which
memory resources in each procedure relate to each other. This
information is cast in terms of resource numbers, and is
collected in a similar manner by each front end. Each
gatherer applies a set of language-specific alias rules to the
source. A later component of the optimizer called the
aliaser reorganizes this information in terms more suitable
for use by the local data flow component of the optimizer.
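
The effect of alias information on common subexpression
detection can be shown with a small Python sketch. It is an
illustration only: the statement encoding, the function name,
and the `aliases` table (mapping a pointer to the variables it
may point to) are assumptions, not the data structures used by
the gatherers or the aliaser.

```python
def redundant_computations(block, aliases):
    """Return indices of assignments whose expression is already available.

    Statements are ("assign", target, (op, a, b)) or
    ("store_through", pointer).  aliases[pointer] is the set of
    variables the pointer may point to.
    """
    available = set()
    redundant = []
    for n, stmt in enumerate(block):
        if stmt[0] == "assign":
            _, target, expr = stmt
            if expr in available:
                redundant.append(n)
            killed, new = {target}, expr
        else:
            _, pointer = stmt
            killed, new = set(aliases[pointer]), None
        # Drop every expression that reads a variable just changed.
        available = {e for e in available if not killed & set(e[1:])}
        if new is not None and not killed & set(new[1:]):
            available.add(new)
    return redundant


block = [
    ("assign", "i", ("+", "j", "k")),   # i = j + k
    ("store_through", "p"),             # *p = 1
    ("assign", "i", ("+", "j", "k")),   # i = j + k
]
print(redundant_computations(block, {"p": set()}))   # p aliases nothing here
print(redundant_computations(block, {"p": {"j"}}))   # p may point to j
```

When p cannot point to j or k the second addition is flagged as
redundant; once *p and j are aliased, the store invalidates the
earlier value of j + k and nothing is flagged.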

Each gatherer had to solve aliasing problems specific to
its particular target language. For example, the Pascal
gatherer was able to use Pascal's strong typing to aid in
building sets of resources that a pointer of some particular
type can point to. Since C does not have strong typing, the
C gatherer could make no such assumptions. The COBOL
compiler had to solve the aliasing problems that are
introduced with the REDEFINE statement, which can make data
items look like arrays. Fig. 4 shows the structure of the
new compilers from an aliasing perspective. It details data
and control dependencies. Once the aliasing data has been
incorporated into the data flow information, every
component in the optimizer has access to the information, and
incorrect program transformations are prevented.

The aliaser also finishes the calculation of the aliasing
relationships by calculating the transitive closure* on the
aliasing information collected by the gatherers. The need
for this calculation is seen in the following skeleton Pascal
example:

procedure p; 
begin 

P : integer; 
Q : integer; 



end; 



The aliasing information concerning q must be transferred
to p, and vice versa, because of the effects of the two
assignment statements shown. The aliaser is an optimizer
component used by all the front ends, and requires no
language specific data. Another type of memory aliasing
occurs when two or more programmer variables can overlap
with one another in memory. This happens within C unions
and Fortran equivalence statements. Each gatherer must
also deal with this issue, as well as collecting information
concerning the side effects of procedure and function calls
and the use of arrays.

*Transitive closure: For a given resource, the set of resources that can be shown to be
aliased to the given resource by any sequence of aliasing relationships.
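
The closure defined in this footnote can be computed by a
simple graph search. The Python sketch below assumes that the
gatherers deliver their findings as pairs of resources and that
aliasing is treated as symmetric; the function name and the
resource names are hypothetical.

```python
def alias_closure(pairs):
    """Map each resource to every resource reachable through alias pairs."""
    direct = {}
    for a, b in pairs:                       # aliasing taken as symmetric
        direct.setdefault(a, set()).add(b)
        direct.setdefault(b, set()).add(a)

    closure = {}
    for start in direct:
        seen, work = set(), [start]
        while work:                          # depth-first search from start
            resource = work.pop()
            for neighbor in direct[resource]:
                if neighbor != start and neighbor not in seen:
                    seen.add(neighbor)
                    work.append(neighbor)
        closure[start] = seen
    return closure


# p is directly related to x, and q is directly related to p.
closure = alias_closure([("p", "x"), ("q", "p")])
print(sorted(closure["q"]))
print(sorted(closure["x"]))
```

Although q and x never appear together in a pair, the closure
relates them through p, which is the kind of transfer the
aliaser must perform.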



The SLLIC Package 

The SLLIC data structure is allocated, maintained, and
manipulated by a collection of routines called the SLLIC
package. Each code generator is required to use these
routines. The SLLIC package produces an object file from
the SLLIC graph it is presented with, which is either
optimized or unoptimized. During implementation it was
relatively easy to experiment with the design of the object
file, since its creation is only implemented in one place.
The object file is designed to be transportable between
multiple operating systems running on the same
architecture.

The SLLIC graph also contains the symbolic debug infor- 
mation produced by the front end. This information is 
placed into the object file by the SLLIC package. The last 
step in the compilation process is the link phase. The linker 
is designed to support multiple operating systems. As much 
as possible, our goal has been for the new compilers to 
remain unchanged across operating systems, an invaluable 
characteristic for application development. 

Addressing RISC Myths 

The new compiling system provides a language development
system that is consistent across languages. However,
each language presents unique requirements to this system.
Mapping high-level language constructs to a
reduced-complexity computer requires the development of new
implementation strategies. Procedure calls, multiplication,
and other complex operations often implemented in
microcode or supported in the hardware can be addressed with
code sequences tuned to the specific need. The following
discussion is presented in terms of several misconceptions,
or myths, that have appeared in speculative discussions
concerning code generation for reduced-complexity
architectures. Each myth is followed by a description of the
approach adopted for the new HP compilers.

Myth: An architected procedure call instruction is
necessary for efficient procedure calls.



[Figure: the HP Pascal, HP Fortran/77, HP COBOL, and HP C
front ends, shown as separate boxes.]

Fig. 4. Scheme for the collection of alias information.



Modern programming technique encourages programmers
to write small, well-structured procedures rather than
large monolithic routines. This tends to increase the fre- 
quency of procedure calls, thus making procedure call ef- 
ficiency crucial to overall system performance. 

Many machines, like the HP 3000, provide instructions 
to perform most of the steps that make up a procedure call. 
The new HP high-precision architecture does not. The 
mechanism of a procedure call is not architected, but in- 
stead is accomplished by a software convention using the 
simple hardwired instructions. This provides more flexibil- 
ity in procedure calls and ultimately a more efficient call 
mechanism. 

Procedure calls are more than just a branch and return 
in the flow of control. The procedure call mechanism must 
also provide for the passing of parameters, the saving of 
the caller's environment, and the establishment of an envi- 
ronment for the called procedure. The procedure return 
mechanism must provide for the restoration of the calling 
procedure's environment and the saving of return values. 

The new HP machines are register-based machines, but
by convention a stack is provided for data storage. The
most straightforward approach to procedure calls on these
machines assumes that the calling procedure acquires the
responsibility for preserving its state. This approach
employs the following steps:

■ Save all registers whose contents must be preserved
across the procedure call. This prevents the called
procedure, which will also use and modify registers, from
affecting the calling procedure's state. On return, those
register values are restored.

■ Evaluate parameters in order and push them onto the
stack. This makes them available to the called procedure
which, by convention, knows how to access them.

■ Push a frame marker. This is a fixed-size area containing
several pieces of information. Among these is the static
link, which provides information needed by the called
procedure to address the local variables and parameters
of the calling procedure. The return address of the calling
procedure is also found in the stack marker.

■ Branch to the entry point of the called procedure. 

To return from the call, the called procedure extracts the 
return address from the stack marker and branches to it.
The calling procedure then removes the parameters from 
the stack and restores all saved registers before program 
flow continues. 

This simple model correctly implements the steps
needed to execute a procedure call, but is relatively
expensive. The model forces the caller to assume all
responsibility for preserving its state. This is a safe approach, but
causes too many register saves to occur. To optimize the
program's execution, the compiler makes extensive use of
registers to hold local variables and temporary values.
These registers must be saved at a procedure call and
restored at the return. The model also has a high overhead
incurred by the loading and storing of parameters and
linkage information. The ultimate goal of the procedure call
convention is to reduce the cost of a call by reducing
memory accesses.

The new compilers minimize this problem by introducing
a procedure call convention that includes a register
partition. The registers are partitioned into caller-saves (the
calling procedure is responsible for saving and restoring
them), callee-saves (the called procedure must save them
at entry and restore them at exit), and linkage registers.
Thirteen of the 32 registers are in the caller-saves partition
and 16 are in the callee-saves partition. This spreads the
responsibility for saving registers between the calling and
called procedures and leaves some registers available for
linkage.

The register allocator avoids unnecessary register saves
by using caller-saves registers for values that need not be
preserved. Values that must be saved are placed into
registers from the callee-saves partition. At procedure entry,
only those callee-saves registers used in the procedure are
saved. This minimizes the number of loads and stores of
registers during the course of a call. The partition of
registers is not inflexible; if more registers are needed from a
particular partition than are available, registers can be
borrowed from the other partition. The penalty for using these
additional registers is that they must be saved and restored,
but this overhead is incurred only when many registers are
needed, not for all calls.
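
The partition policy can be summarized in a brief Python
sketch. The partition sizes come from the text; everything else
(the function name, the dictionary encoding of values, and the
simple first-come ordering) is an assumption made for
illustration and does not describe the actual allocator.

```python
CALLER_SAVES = 13   # partition sizes as given in the text
CALLEE_SAVES = 16


def choose_partitions(values):
    """Pick a partition for each value.

    values maps a value name to True if it must survive a procedure
    call, and to False otherwise.
    """
    free = {"caller-saves": CALLER_SAVES, "callee-saves": CALLEE_SAVES}
    placement = {}
    for name, survives_call in values.items():
        preferred = "callee-saves" if survives_call else "caller-saves"
        other = "caller-saves" if survives_call else "callee-saves"
        # Borrow from the other partition only when the preferred is full.
        chosen = preferred if free[preferred] > 0 else other
        if free[chosen] == 0:
            raise RuntimeError("out of registers; a real allocator would spill")
        free[chosen] -= 1
        placement[name] = chosen
    return placement


print(choose_partitions({"temp": False, "loop_index": True}))

# Fourteen short-lived values: the last one borrows a callee-saves register.
many = choose_partitions({"t%d" % n: False for n in range(14)})
print(many["t13"])
```

A borrowed register carries the save and restore cost described
above, which is why borrowing is the fallback rather than the
default.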

In the simple model, all parameters are passed by being
placed on the stack. This is expensive because memory
references are made to push each parameter and as a
consequence the stack size is constantly altered. The new
compilers allocate a permanent parameter area large enough to
hold the parameters for all calls performed by the
procedure. They also minimize memory references when storing
parameters by using a combination of registers and memory
to pass parameters. Four registers from the callee-saves
partition are used to pass user parameters; each holds a
single 32-bit value or half of a 64-bit value. Since
procedures frequently have few parameters, the four registers
are usually enough to contain them all. This removes the
necessity of storing parameter values in the parameter area
before the call. If more than four 32-bit parameters are
passed, the additional ones are stored in the preallocated
parameter area. If a parameter is larger than 64 bits, its
address is passed and the called procedure copies it to a
temporary area.
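
These rules can be restated as a short Python sketch. It is a
reading of the paragraph above rather than the actual
convention: the function name is hypothetical, alignment is
ignored, and a 64-bit value that does not fit entirely in the
remaining registers is assumed to go to the parameter area as a
whole.

```python
ARG_REGISTERS = 4   # number of parameter registers, from the text


def place_parameters(sizes_in_bits):
    """Return one (how, where, words) tuple per parameter."""
    placements = []
    next_word = 0
    for bits in sizes_in_bits:
        if bits > 64:
            how, words = "address", 1       # large objects go by address
        else:
            how, words = "value", (1 if bits <= 32 else 2)
        if next_word + words <= ARG_REGISTERS:
            where = "registers"
        else:
            where = "parameter area"
        placements.append((how, where, words))
        next_word += words
    return placements


for placement in place_parameters([32, 64, 32, 32, 128]):
    print(placement)
```

For this call the first three parameters occupy all four
registers, and the remaining two, including the address of the
128-bit object, fall into the preallocated parameter area.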

Additional savings on stores and loads occur when the 
called procedure is a leaf routine. As mentioned previously, 
the optimizer attempts to maximize the use of registers to 
hold variable values. When a procedure is a leaf, the register
allocator uses the caller-saves registers for this purpose,
thus eliminating register saves for both the calling and 
called procedures. It is never necessary to store the return 
address or parameter registers of a leaf routine since they 
will not be modified by subsequent calls. 

Leaf routines do not need to build a stack frame, since 
they make no procedure calls. Also, if the allocator suc- 
ceeds in representing all local variables as registers, it is 
not necessary to build the local variable area at entry to 
the leaf procedure.

The convention prescribes other uses of registers to eliminate
other loads and stores at procedure calls. The return
address is always stored in a particular register, as is the
static link if it is needed.

To summarize, the procedure call convention used in
the new HP computers streamlines the overhead of
procedure calls by minimizing the number of memory references.
Maximal use of registers is made to limit the number of
memory accesses needed to handle parameters and linkage.
Similarly, the convention minimizes the need to store
values contained in registers and does not interfere with
attempts at optimization.

Myth: The simple instructions available in RISC
result in significant code expansion.

Many applications, especially commercial applications,
assume the existence of complex high-level instructions
typically implemented by the system architecture in
microcode or hardware. Detractors of RISC argue that significant
code expansion is unavoidable since the architecture lacks
these instructions. Early results do not substantiate this
argument.7,8 The new HP architecture does not provide
complex instructions because of their impact on overall
system performance and cost, but their functionality is
available through other means.

As described in an earlier article,2 the new HP machines
do not have a microcoded architecture and all of the
instructions are implemented in hardware. The instructions
on microcoded machines are implemented in two ways.
At the basic level, instructions are realized in hardware.
More complex instructions are then produced by writing
subroutines of these hardware instructions. Collectively,
these constitute the microcode of the machine. Which
instructions are in hardware and which are in microcode are
determined by the performance and cost goals for the
system. Since HP's reduced instruction set is implemented
solely at the hardware level, subroutines of instructions
are equivalent to the microcode in conventional
architectures.

To provide the functionality of the complex instructions
usually found in the architecture of conventional machines,
the design team developed the alternative concept of
millicode instructions or routines. Millicode is HP's
implementation of complex instructions using the simple
hardware instructions packaged into subroutines. Millicode
serves the same purpose as traditional microcode, but is
common across all machines of the family rather than
specific to each.

The advantages of implementing functionality as
millicode are many. Microcoded machines may contain hidden
performance penalties on all instructions to support
multiple levels of instruction implementation. This is not
the case for millicode. From an architectural viewpoint,
millicode is just a collection of subroutines
indistinguishable from other subroutines. A millicode
instruction is executed by calling the appropriate millicode
subroutine. Thus, the expense of executing a millicode
instruction is only present when the instruction is used. The
addition of millicode instructions has no hardware cost and
hence no direct influence on system cost. It is relatively
easy and inexpensive to upgrade or modify millicode in the
field, and it can continue to be improved, extended, and
tuned over time.

Unlike most microcode, millicode can be written in the
same high-level languages as other applications, reducing
development costs yet still allowing for optimization of
the resultant code. Severely performance-critical millicode
can still be assembly level coded in instances where the
performance gain over compiled code is justified. The size
of millicode instructions and the number of such
instructions are not constrained by considerations of the
size of available control store. Millicode resides in the
system as subroutines in normally managed memory, either in
virtual memory where it can be paged into and out of the
system as needed, or in resident memory as performance
considerations dictate. A consequence of not being bound by
restrictive space considerations is that compiler writers
are free to create many more specialized instructions in
millicode than would be possible in a microcoded
architecture, and thus are able to create more optimal
solutions for specific situations.

Most fixed instruction sets contain complex instructions
that are overly general. This is necessary since it is costly
to architect many variations of an instruction. Examples
of this are the MVB (move bytes) and MVW (move words)
instructions on the HP 3000. They are capable of moving
any number of items from any arbitrary source location to
any target location. Yet, the compiler's code generators
frequently have more information available about the
operands of these instructions that could be used to
advantage if other instructions were available. The code
generators frequently know whether the operands overlap,
whether the operands are aligned favorably, and the number
of items to be moved. On microcoded machines, this
information is lost after code generation and must be
recreated by the microcode during each execution of the
instruction. On the new HP computers, the code generators
can apply such information to select a specialized millicode
instruction that will produce a faster run-time execution of
the operation than would be possible for a generalized
routine.
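
The selection step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not HP's code generator: the MoveInfo fields, the size threshold of 8, and all four routine names are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MoveInfo:
    """Facts a code generator might know about a move at compile time."""
    length: Optional[int] = None   # item count, if it is a constant
    word_aligned: bool = False     # both operands start on word boundaries
    may_overlap: bool = True       # assume the worst unless proven otherwise


def select_move_routine(info: MoveInfo) -> str:
    """Pick a (hypothetical) millicode move routine from compile-time facts."""
    if info.may_overlap:
        # Nothing is known, so fall back to the fully general routine.
        return "move_general"
    if info.word_aligned and info.length is not None and info.length <= 8:
        # Short, aligned, non-overlapping: a straight-line word copy suffices.
        return "move_words_short"
    if info.word_aligned:
        return "move_words"
    return "move_bytes_no_overlap"
```

A general-purpose MVB-style instruction corresponds to always taking the first branch; the specialized routines are reachable only because the information is still available when the routine is chosen.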

Access to millicode instructions is through a mechanism
similar to a procedure call. However, additional restrictions
placed on the implementation of millicode routines prevent
the introduction of any barriers to optimization. Millicode
routines must be leaf routines and must have no
effect on any registers or memory locations other than the
operands and a few scratch registers. Since millicode calls
are represented in SLLIC as pseudoinstructions, the
optimizer can readily distinguish millicode calls from
procedure calls. Millicode calls also use different linkage
registers from procedure calls, so there is no necessity of
preserving the procedure's linkage registers before invoking
millicode instructions.

The only disadvantage of the millicode approach over 
microcode is that the initiation of a millicode instruction 
involves an overhead of at least two instructions. Even so, 
it is important to realize that for most applications, mil- 
licode instructions are infrequently needed, and their over- 
head is incurred only when they are used. The high-preci- 
sion architecture provides the frequently needed instruc- 
tions directly in hardware. 

Myth: RISC machines must implement integer
multiplication as successive additions.

Integer multiplication is frequently an architected in- 
struction. The new architecture has no such instruction 
but provides others that support an effective implementa- 
tion of multiplication. It also provides for inclusion of a 
high-speed hardware multiplier in a special function unit, J 

Our measurements reveal that most multiplication
operations generated by user programs involve multiplications
by small constants. Many of these occurrences are explicitly
in the source code, but many more are introduced by the
compiler for address and array reference evaluation. The
new compilers have available a trio of instructions that
perform shift and add functions in a single cycle. These
instructions, SH1ADD (shift left once and add), SH2ADD (shift
left twice and add) and SH3ADD (shift left three times and
add) can be combined in sequences to perform multiplication
by constants in very few instructions. Multiplications
by most constants with absolute values less than 1040 can
be accomplished in fewer than five cycles. Negatively
signed constants require an additional instruction to apply
the sign to the result. Multiplication by all constants that
are exact powers of 2 can be performed with a single shift
instruction unless overflow conditions are to be detected.
Additionally, multiplications by 4 or 2 for indexed
addressing can be avoided entirely. The LDWX (load word indexed)
and LDHX (load half-word indexed) instructions optionally
perform unit indexing, which combines multiplication of
the index value with the address computation in the
hardware.

The following examples illustrate multiplication by
various small constants.

Source code:
    4*k
Assembly code:
    SH2ADD  8,0,9       ; shift r8 (k) left 2 places,
                        ; add to r0 (zero) into r9

Source code:
    -163*k
Assembly code:
    SH3ADD  8,8,1       ; shift r8 (k) left 3 places, add
                        ; to itself into r1
    SH3ADD  1,1,1       ; shift r1 left 3 places, add to
                        ; itself into r1
    SH1ADD  1,8,1       ; shift r1 left 1 place, add to
                        ; k into r1
    SUB     0,1,1       ; subtract result from r0 to
                        ; negate; back into r1

Source code:
    A(k)
Assembly code:
    LDO     -404(30),9  ; load array base address
                        ; into r9
    LDW     56(0,30),7  ; load unit index value into r7
    LDWX,S  7(0,9),5    ; multiply index by 4 and
                        ; load element into r5
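
The -163*k sequence can be checked by modeling each shift-and-add instruction as ordinary integer arithmetic. The Python sketch below is an illustration only: the helper names shadd and times_neg163 are invented here, and unbounded Python integers stand in for 32-bit registers, so overflow behavior is not modeled.

```python
def shadd(shift, r1, r2):
    """Model SHnADD r1,r2,t: return (r1 shifted left by n) + r2."""
    return (r1 << shift) + r2


def times_neg163(k):
    """Follow the four-instruction sequence for -163*k step by step."""
    r1 = shadd(3, k, k)    # SH3ADD 8,8,1 : r1 = 8*k + k     = 9*k
    r1 = shadd(3, r1, r1)  # SH3ADD 1,1,1 : r1 = 8*r1 + r1   = 81*k
    r1 = shadd(1, r1, k)   # SH1ADD 1,8,1 : r1 = 2*r1 + k    = 163*k
    return 0 - r1          # SUB    0,1,1 : r1 = r0 - r1     = -163*k
```

The intermediate products 9*k and 81*k show why three shift-and-add steps are enough: 163 = 2*(9*9) + 1.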



When neither operand is constant or if the constant is
such that the in-line code sequence would be too large,
integer multiplication is accomplished with a millicode
instruction. The multiply millicode instruction operates
under the premise that even when the operands are
unknown at compile time, one of them is still likely to be a
small value. Application of this to the multiplication
algorithm yields an average multiplication time of 20
cycles, which is comparable to an iterative hardware
implementation.
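
To see why a small operand helps, consider a plain shift-and-add multiply that loops over the bits of the smaller operand and stops as soon as no bits remain. The Python sketch below illustrates that premise only; it is not the actual millicode algorithm, and the function name and step counter are assumptions made for this example.

```python
def multiply(a, b):
    """Shift-and-add multiply that iterates over the smaller operand.

    Returns (product, steps), where steps counts loop iterations.
    """
    negative = (a < 0) != (b < 0)
    a, b = abs(a), abs(b)
    if a > b:
        a, b = b, a            # make a the operand with fewer significant bits
    product = 0
    steps = 0
    while a:                   # early exit once the small operand is used up
        if a & 1:
            product += b
        a >>= 1
        b <<= 1
        steps += 1
    return (-product if negative else product), steps
```

With this scheme, multiply(3, 100000) needs only two iterations, whereas a loop that always walks all 32 bits would need 32.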

Myth: RISC machines cannot support commercial
applications languages.

A popular myth about RISC architectures is that they
cannot effectively support languages like COBOL. This
belief is based on the premise that RISC architectures cannot
provide hardware support for the constructs and data types
of COBOL-like languages while maintaining the
one-instruction-one-cycle advantages of RISC. As a consequence,
some feel that the code expansion resulting from performing
COBOL operations using only the simple architected
instructions would be prohibitive. The significance of this
is often overstated. Instruction traces of COBOL programs
measured on the HP 3000 indicate that the frequency of
decimal arithmetic instructions is very low. This is because
much of the COBOL program's execution time is spent in
the operating system and other subsystems.



COBOL does place demands on machine architects and
compiler designers that are different from those of
languages like C, Fortran, and Pascal. The data items provided
in the latter languages are represented in binary and hence
are native to the host machine. COBOL data types also
include packed and unpacked decimal, which are not
commonly native and must be supported in ways other than
directly in hardware.

The usual solution on conventional machines is to
provide a commercial instruction set in microcode. These
additional instructions include those that perform COBOL
field (variable) moves, arithmetic for packed decimal
values, alignment, and conversions between the various
arithmetic types.

In the new HP machines, millicode instructions are used
to provide the functionality of a microcoded commercial
instruction set. This allows the encapsulation of COBOL
operations while removing the possibility of runaway code
expansion. Many COBOL millicode instructions are
available to do each class of operation. The compiler expends
considerable effort to select the optimal millicode
operation based on compile-time information about the
operation and its operands. For example, to generate code to
perform a COBOL field move, the compiler may consider
the operand's relative and absolute field sizes and whether
blank or zero padding is needed before selecting the
appropriate millicode instruction.

Hardware instructions that assist in the performance of
some COBOL operations are architected. These instructions
execute in one cycle but perform operations that
would otherwise require several instructions. They are
emitted by the compiler in in-line code where appropriate
and are also used to implement some of the millicode
instructions. For example, the DCOR (decimal correct) and
UADDCM (unit add complement) instructions allow packed
decimal addition to be performed using the binary ADD
instruction. UADDCM prepares an operand for addition and
the DCOR restores the result to packed decimal form after
the addition. For example:

    r1 and r2 contain packed decimal operands
    r3 contains the constant X'99999999'

    UADDCM  1,3,31      ; pre-bias operand into r31
    ADD     2,31,31     ; perform binary add
    DCOR    31,31       ; correct result
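
The three-instruction packed decimal add can be modeled in Python to see why the pre-bias works. The sketch below rests on stated assumptions: UADDCM is modeled as r1 plus the one's complement of r3 (which adds 6 to every digit when r3 is X'99999999'), and DCOR is modeled as subtracting 6 from every 4-bit digit that did not carry out during the add. On the real machine the per-digit carries come from hardware state; here they are recomputed explicitly, and the names dcor and packed_add are illustrative.

```python
MASK32 = 0xFFFFFFFF


def dcor(total, carries):
    """Subtract 6 from each 4-bit digit that did not produce a carry.

    Bit i of `carries` holds the carry into bit i of the preceding add,
    so the carry out of digit d is bit 4*(d+1).
    """
    for digit in range(8):
        if not (carries >> (4 * (digit + 1))) & 1:
            total -= 6 << (4 * digit)
    return total & MASK32


def packed_add(r1, r2):
    """Add two 8-digit packed decimal words (final carry-out is dropped)."""
    r3 = 0x99999999
    r31 = (r1 + (~r3 & MASK32)) & MASK32   # UADDCM 1,3,31: add 6 to each digit
    full = r2 + r31                        # ADD 2,31,31 (kept unmasked for carry)
    carries = r2 ^ r31 ^ full              # recover the carry into every bit
    return dcor(full & MASK32, carries)    # DCOR 31,31
```

For example, packed_add(0x19, 0x23) yields 0x42: the low digit carries (9 + 6 + 3 exceeds 15) and is left alone, while every other digit has its bias of 6 removed.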



Millicode instructions support arithmetic for both
packed and unpacked decimal data. This is a departure
from the HP 3000, since on that machine unpacked
arithmetic is performed by first converting the operand to
packed format, performing the arithmetic operation on the
packed data, and then converting the result back to
unpacked representation. Operations occur frequently enough
on unpacked data to justify the implementation of
unpacked arithmetic routines. The additional cost to
implement them is minimal and avoids the overhead of
converting operands between the two types. An example of the
code to perform an unpacked decimal add is:

    r1 and r2 contain unpacked decimal operands
    r3 contains the constant X'96969696'
    r4 contains the constant X'0f0f0f0f'
    r5 contains the constant X'30303030'

    ADD     3,1,31      ; pre-bias operand into r31
    ADD     31,2,31     ; binary add into r31
    DCOR    31,31       ; correct result
    AND     4,31,31     ; mask result
    OR      5,31,31     ; restore sum to unpacked decimal
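
The unpacked sequence can be traced the same way. In the Python sketch below, each operand byte is assumed to hold one digit in the form X'3d', and DCOR is again modeled as subtracting 6 from every 4-bit digit that did not carry out of the preceding add. The helper names are invented for this illustration and the per-digit carries are recomputed in software.

```python
MASK32 = 0xFFFFFFFF


def add32(x, y):
    """32-bit add that also returns the carry into every bit position."""
    full = x + y
    return full & MASK32, x ^ y ^ full


def dcor(total, carries):
    """Subtract 6 from each 4-bit digit that did not carry out."""
    for digit in range(8):
        if not (carries >> (4 * (digit + 1))) & 1:
            total -= 6 << (4 * digit)
    return total & MASK32


def unpacked_add(r1, r2):
    """Add two 4-digit unpacked decimal words, one X'3d' byte per digit."""
    r31, _ = add32(0x96969696, r1)       # ADD 3,1,31  : pre-bias operand
    r31, carries = add32(r31, r2)        # ADD 31,2,31 : binary add
    r31 = dcor(r31, carries)             # DCOR 31,31  : correct result
    r31 &= 0x0F0F0F0F                    # AND 4,31,31 : mask result
    return r31 | 0x30303030              # OR 5,31,31  : restore zone nibbles
```

The bias X'96' does two jobs at once: the 6 biases the digit nibble as in the packed case, and the 9 raises the zone nibble so that a digit carry ripples through it into the next byte.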



In summary, COBOL is supported with a blend of
hardware assist instructions and millicode instructions.
The compiled code is compact and meets the run-time
execution performance goals.



Conclusions 

The Spectrum program began as a joint effort of hardware
and software engineers. This early communication allowed
high-level language issues to be addressed in the
architectural design.

The new HP compiling system was designed with a re- 
duced-complexity machine in mind. Register allocation, 
instruction scheduling, and traditional optimizations allow 
compiled programs to make efficient use of registers and 
low-level instructions. 

Early measurements have shown that this compiler tech- 
nology has been successful in exploiting the capabilities 
of the new architecture. The run-time performance of com- 
piled code consistently meets performance objectives. 
Compiled code sizes for high-level languages implemented 



An Optimization Example 



This example illustrates the code generated for the following
C program for both the unoptimized and the optimized case.

    test ()
    {
        int i, j;
        int a1[25], a2[25], r[25][25];

        for (i = 0; i < 25; i++) {
            for (j = 0; j < 25; j++) {
                r[i][j] = a1[i]*a2[j];
            }
        }
    }



In the example code that follows, the following mnemonics are
used:

    rp      return pointer, containing the address to which
            control should be returned upon completion of
            the procedure
    arg0    first parameter register
    arg1    second parameter register
    sp      stack pointer, pointing to the top of the
            current frame
    mret0   millicode return register
    mrp     millicode return pointer.

The value of register zero (r0) is always zero.

The following is a brief description of the instructions used:

    LDO      immed(r1),r2     r2 <- r1 + immed
    LDW      immed(r1),r2     r2 <- *(r1 + immed)
    LDWX,S   r1(r2),r3        r3 <- *(4*r1 + r2)
    STW      r1,immed(r2)     *(r2 + immed) <- r1
    STWS     r1,immed(r2)     *(r2 + immed) <- r1
    STWM     r1,immed(r2)     *(r2 + immed) <- r1 AND
                              r2 <- r2 + immed
    COMB,<=  r1,r2,label      if r1 <= r2, branch to label
    BL       label,r1         branch to label, and put return
                              address into r1 (for procedure call)
    BV       0(r1)            branch to address in r1 (for
                              procedure return)
    ADD      r1,r2,r3         r3 <- r1 + r2
    SH1ADD   r1,r2,r3         r3 <- 2*r1 + r2
    SH2ADD   r1,r2,r3         r3 <- 4*r1 + r2
    SH3ADD   r1,r2,r3         r3 <- 8*r1 + r2
    COPY     r1,r2            r2 <- r1
    NOP                       no effect



In the following step-by-step discussion, the unoptimized code
on the left is printed in black, and the optimized code on the
right is printed in color. The code appears in its entirety, and
can be read from the top down in each column.

Save callee-saves registers and increment stack pointer. The
unoptimized case uses no register that needs to be live across
a call.

Unoptimized:
    LDO     2760(sp),sp
Optimized:
    STW     2,-20(0,sp)
    STWM    3,2768(0,sp)
    STW     4,-2764(0,sp)

Assign zero to i. In the optimized case, i resides in register 19.

Unoptimized:
    STW     0,-52(0,sp)
Optimized:
    COPY

Compare i to 25. This test is eliminated in the optimized case
since the value of i is known.

Unoptimized:
    LDW     -52(0,sp),1
    LDO     25(0),31
    COMB,<=,N 31,1,L2



In the optimized version, a number of expressions have been
moved out of the loop:

Optimized:
    LDO     25(0),20        {maximum value of j}
    LDO     -156(sp),22     {address of a1}
    LDO     -256(sp),24     {address of a2}
    LDO     -2756(sp),28    {address of r}
    LDO     0(0),4          {initial value of 100*i}
    LDO     2500(0),2       {maximum value of 100*i}



Initialize j to zero, and compare j to 25. This test has also been
eliminated in the optimized version, since the value of j is known.
Note that j now resides in register 21.

L3

Unoptimized:
    STW     0,-56(0,sp)
    LDW     -56(0,sp),19
    LDO     25(0),20
    COMB,<= 20,19,L1
Optimized:
    COPY    0,21

in this low-level instruction set are comparable to those
for more conventional architectures. Use of millicode
instructions helped achieve this result. Complex high-level
language operations such as procedure calls, multiplication,
and COBOL constructs have been implemented efficiently
with the low-level instructions provided by the
high-precision architecture. A later paper will present
performance measurements.



In the optimized version, the load of a1[i] is moved out of the
inner loop, since the value of i is constant in the inner loop.

Optimized:
    LDWX,S  19(0,22),23

Register 28 contains the address of r, and register 4 contains
the value 100*i, which is the offset of the ith row of array r. This
is constant over the inner loop, and has been moved out.

Optimized:
    ADD     28,4,3



L6 



The loop begins with the load of a1[i] into the first parameter
register. This value has already been loaded in the optimized
version, and need only be copied.

Unoptimized:
    LDO     -156(sp),21
    LDW     -52(0,sp),22
    LDWX,S  22(0,21),arg0
Optimized:
    COPY    23,arg0



The value of a2[j] is loaded into the second parameter register,
and the multiply millicode instruction is called. In the optimized
case, the address of a2[0] and the value of j are both already
in registers.

Unoptimized:
    LDO     -256(sp),1
    LDW     -56(0,sp),19
    BL      mulI,mrp
    LDWX,S  19(0,1),arg1
Optimized:
    BL      mulI,mrp
    LDWX,S  21(0,24),arg1



Store the result into r[i][j]. The three SHxADD instructions
calculate 100*i. Note that most of the following is loop invariant,
and has been moved out of the loop in the optimized case.

Unoptimized:
    LDO     -2756(sp),19    {address of r}
    LDW     -52(0,sp),20    {value of i}
    SH1ADD  20,20,21        {r21 <- 3*i}
    SH3ADD  21,20,22        {r22 <- 25*i}
    SH2ADD  22,0,1
    ADD     19,1,31         {address of r + 100*i}
    LDW     -56(0,sp),19    {value of j}
    SH2ADD  19,31,20        {add j*4 to address}
    STWS    mret0,0(0,20)   {store}
Optimized:
    SH2ADD  21,3,31
    STWS    mret0,0(0,31)



Increment j.

Unoptimized:
    LDW     -56(0,sp),21
    LDO     1(21),22
    STW     22,-56(0,sp)
Optimized:
    LDO     1(21),21



Compare j to the value 25 (already in register 20 in the
optimized version). The position after the conditional branch
contains no useful instruction in the unoptimized case. In the
optimized version, the first instruction of the loop has been copied
to this position, and the target adjusted to the following instruction.
Because the branch has the nullification flag set (,N), the following
instruction will not be executed when the branch is not taken.

Unoptimized:
    LDW     -56(0,sp),1
    LDO     25(0),31
    COMBF,<= 31,1,L6
    NOP
Optimized:
    COMBF,<=,N 20,21,L6+4
    LDWX,S  21(0,24),25



LI 



Increment i, and test for the end of the loop. In the optimized
version, induction variable elaboration has removed the 100*i
multiplication, and added a new induction variable to contain
that value. This value, in register 4, is now tested against a
maximum value of 2500, contained in register 2. This branch has
been scheduled like the previous branch.

Unoptimized:
    LDW     -52(0,sp),19
    LDO     1(19),20
    STW     20,-52(0,sp)
    LDW     -52(0,sp),21
    LDO     25(0),22
    COMBF,<= 22,21,L3
    NOP
Optimized:
    LDO     1(19),19
    LDO     100(4),4
    COMBF,<=,N 2,4,L3+4
    COPY    0,21



L2 



Finally, the registers are restored, and control is returned to
the calling procedure.

Unoptimized:
    BV      0(rp)
    LDO     -2760(sp),sp
Optimized:
    LDW     -2788(0,sp),2
    LDW     -2764(0,sp),4
    BV      0(rp)
    LDWM    -2768(0,sp),3
A Stand-Alone Measurement Plotting
System

This compact laboratory instrument serves as an X-Y
recorder, a low-frequency waveform recorder, a digital
plotter, or a data acquisition system.

by Thomas H. Daniels and John A. Fenoglio 



MANY PHYSICAL PHENOMENA are characterized
by parameters that are transient or slowly varying.
If these changes can be recorded, they can be
examined at leisure and stored for future reference or
comparison. To accomplish this recording, a number of
electromechanical instruments have been developed, among
them the X-Y recorder. In this instrument, the displacement
along the X-axis represents a parameter of interest or time,
and the displacement along the Y-axis varies as a function
of yet another parameter.

Such recorders can be found in many laboratories
recording experimental data such as changes in temperature,
variations in transducer output levels, and stress versus applied
strain, to name just a few. However, the study of more
complex phenomena and the use of computers for storage
of data and control of measurement systems requires
enhancement of the basic X-Y recorder. Meeting the need is
Hewlett-Packard's next-generation laboratory recorder, the
HP 7090A (Fig. 1), a compact stand-alone instrument
that can be used as a conventional X-Y recorder, a
low-frequency waveform recorder, a digital plotter, and a complete
data acquisition system.

X-Y Recorder Features 

The HP 7090A Measurement Plotting System offers many
improvements for the conventional X-Y recorder user. In
the past, X-Y recorders have been limited to a frequency
response of a few hertz by the response time of the
mechanism. The HP 7090A uses analog-to-digital converters
(ADCs) and digital buffers to extend the measurement
bandwidth well beyond the limits of the mechanism. Each
input channel has a 12-bit ADC capable of a 30-kHz sample
rate. Since it is necessary to have about 10 samples/cycle
for a good plot of the signal (remember, the minimum
Nyquist rate of two samples/cycle only applies if there is
a perfect low-pass output filter), this approach allows
signals with bandwidths up to 3 kHz to be recorded.

Fig. 1. The HP 7090A Measurement Plotting System combines
many of the features of an X-Y recorder, a low-frequency
waveform recorder, a digital plotter, and a data acquisition
system in one instrument that can be operated by itself or
as part of a larger computer-controlled system.

The front-end amplifier presented many design challenges.
High common-mode rejection, high sensitivity, low
noise, and static protection were a few of the more difficult
areas. X-Y and stripchart recorders have used floating input
circuitry to allow users maximum flexibility in connecting
signals to the measuring device. The degree to which input
signals can be isolated from chassis ground is specified as
the common mode rejection (CMR). Achieving a high CMR
means that the input circuitry must not be connected to
chassis ground. This requirement posed a dilemma for a
microprocessor-controlled system like the HP 7090A
because the microprocessing system must be connected to
ground for noise prevention reasons. This design
contradiction is resolved by using small independent power supplies
for the front-end channels and by doing all of the data
communication via optoisolator links. The point in the
system where the floating circuitry is connected to the
processing circuitry is shown by the optoisolator in the
system block diagram (Fig. 2).

The most sensitive range of the HP 7090A is 5 mV full
scale. The 12-bit resolution of the ADC allows
measurements as low as 1 µV. Input amplifier noise and all external
switching noises must be kept well below 1 µV over the
full 3-kHz bandwidth. In addition, the standard HP design
requirement of electrostatic discharge protection offered
an even greater challenge: the same high-sensitivity
floating input must be able to withstand 25-kV discharges
directly to the input terminals! (See article on page 32 for
details about the front-end design.)

The microprocessor is used for many functions,
including signal processing on the raw analog-to-digital
measurements. This makes it possible to calibrate the instrument
digitally. Hence, there are no adjustment potentiometers
in the HP 7090A (see box on page 22). During the factory
calibration, a known voltage is applied to the input and
the microprocessor reads the corresponding value at the
output of the ADC. The calibration station then compares
this value with the expected value. Any small deviation
between the measured and expected values is converted
to a calibration constant that is stored in the HP 7090A's
nonvolatile memory (an electrically erasable, programmable
read-only memory, or EEPROM). This constant is used
by the internal microprocessor to convert raw measurement
data to calibrated measurement data during the normal
operation of the instrument. In addition, offset errors are
continually checked and corrected during measurements.
This helps eliminate the offset or baseline drifts normally
associated with high-sensitivity measurements.

The use of a microprocessor also allows the user of an
HP 7090A to select a very large number of calibrated input
voltage ranges. Conventional approaches to input ranging
usually involve mechanical attenuator switches with about
fourteen fixed positions corresponding to fourteen fixed
ranges. An uncalibrated vernier potentiometer is used for
settings between the fixed ranges. The HP 7090A uses
digitally programmable preamplifiers and attenuators. The
gain of this circuitry can be set to 41,000 different values.
The microprocessor commands different gain settings by
writing to the front-end control circuitry via the
optoisolator link.

Low-Frequency Waveform Recorder Features 

The HP 7090A also can be used as a low-frequency
waveform recorder. Triggering on selected input signal
conditions allows a waveform recorder to capture transient
events. In the HP 7090A, the triggering modes are expanded
from the traditional level-and-slope triggering to include
two modes of window triggering. The outside window
mode allows for triggering on signals that break out of
either an upper or a lower window boundary. The special
inside window mode allows for triggering when the signal
stays inside upper and lower window boundaries for the
entire measurement period. The latter is the only effective
way to trigger on a decaying periodic waveform like that
caused by an ac power line failure (Fig. 3).
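
The two window modes amount to simple tests on the stream of digitized samples. The Python sketch below illustrates the logic only and makes no claim about the instrument's firmware or gate array; the function names, the inclusive window bounds, and the `required` run-length parameter are assumptions made for this example.

```python
def outside_window_trigger(samples, lower, upper):
    """Return the index of the first sample that leaves the window, or None."""
    for i, s in enumerate(samples):
        if s < lower or s > upper:
            return i
    return None


def inside_window_trigger(samples, lower, upper, required):
    """Return the index at which `required` consecutive samples have
    stayed inside the window, or None if that never happens."""
    run = 0
    for i, s in enumerate(samples):
        run = run + 1 if lower <= s <= upper else 0  # any excursion resets the count
        if run == required:
            return i
    return None
```

For a decaying waveform such as [100, -100, 80, -80, 5, -5, 3, -2, 1, 0] with a window of -10 to 10 and required=4, the inside mode fires at index 7, once four consecutive samples have stayed within the window, while the outside mode fires immediately at index 0.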

To implement the sophisticated triggering capability
described above, the HP 7090A uses digital triggering
techniques. No analog circuitry is involved. The trigger decision
is made by looking at the digitized input data that comes
from the ADCs and comparing this to the desired trigger
conditions set by the user. At the higher sampling rates
the microprocessor is not fast enough to make trigger
decisions unaided. Therefore, a semicustom LSI circuit is used
to augment the processor in this area. This IC is a CMOS
770-gate array especially programmed to do input data
buffer management. It is shown in the system block diagram
as the front-end gate array.



Fig. 2. Simplified block diagram of the HP 7090A.

Fig. 3. The HP 7090A's special inside window triggering
mode allows capture of waveforms that remain inside the
window for the measurement period. In the above example,
the trigger occurs after the thousandth consecutive sample
that is inside the window defined by setting the TRIGGER LEVEL
and TRIGGER WIDTH controls on the front panel. This enables
the recording of such events as a decaying periodic waveform
caused by an ac line failure.



One final measurement feature is the highly accurate, 
wide-dynamic-range, time-axis capability that comes about 
because the HP 7090A's time axis is computed by dividing 
the system's crystal-controlled clock frequency. This al- 
lows for time sweeps from 0.03 second to 24 hours full 
scale. 

Recorder/Plotter Features 

The desire to produce a single product that could be
used as a continuous X-Y recorder and as a
full-performance digital plotter created many different performance
objectives for the sophisticated servo system found in the
HP 7090A. It uses three separate servo algorithms, each
optimized for a specific task. In the digital plotter mode,
the servo must match both axes and faithfully draw straight
line vectors between end points.

Plotting data from the digitized input signal buffers also 
requires the servo to draw vectors between the data points, 
but there is a subtle difference. In this case, the servo can 
be optimized to look ahead at the next few data points and 



Eliminating Potentiometers 



Potentiometers are not needed in the HP 7090A Measurement
Plotting System because its internal microprocessor:

■ Controls the front end
■ Determines the gain constants
■ Performs the offset calibration
■ Corrects the data.

The microprocessor has the ability to write to three main ports
in the front-end channel (see Fig. 1). The first port controls the FET
switches and relays that govern the coarse gain settings and
the relay that passes the input signal into the front-end amplifiers.
The second port determines the amount of offset fed into the
input signal. The third port establishes the attenuation that the
signal sees by means of the digitally programmable amplifier.
This port governs the fine gain settings.

There are 14 coarse gain settings covering a span of 5 mV to
100V, inclusive. While an HP 7090A is on the assembly line, it
passes through a calibration of the gain of each of the three
channels at each of the coarse gain settings. This calibration
procedure produces a two-byte number for each channel at each
setting, and then stores these numbers in nonvolatile memory.



To determine these numbers, an offset calibration is performed
(as discussed later) and a voltage equal to the full-scale voltage
is placed on the inputs of the channel. For example, if the
full-scale voltage of the front end is set to 200 mV, a 200-mV dc
signal is placed on the inputs. The buffer is filled using a 250-ms
measurement time base and 200 of the uncorrected
analog-to-digital samples are sent over the HP 7090A's HP-IB (IEEE 488)
to the controller, an HP Series 200 Computer. These samples
are not internally corrected; they are the direct output of the ADC
in the instrument's front end. These samples are averaged, and
the average A is put into the following formula:

$$\text{Gain constant} = \left(\frac{1974}{A - S - 2048}\right)\left(\frac{\text{DVM}}{\text{Ideal Volts}}\right)$$



where DVM is the voltage read by a digital voltmeter of the input
voltage to the front end, and Ideal Volts corresponds to the
full-scale voltage that should be on the input. S is the software offset
found by the offset calibration. The typical result is about 1.03.
The word stored in the nonvolatile memory is the gain constant

Fig. 1. Block diagram of front-end section of the HP 7090A.

Fig. 2. Flow chart of front-end calibration procedure for each
channel of the HP 7090A.
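
As a worked illustration of the gain constant formula given in this box, the short Python sketch below evaluates it for made-up inputs. The function and argument names are chosen here for readability, and the sample values are hypothetical: an averaged reading of 3964.5 counts, no software offset, and a voltmeter reading that matches the ideal 200 mV exactly.

```python
def gain_constant(avg_counts, software_offset, dvm_volts, ideal_volts):
    """Gain constant = (1974 / (A - S - 2048)) * (DVM / Ideal Volts)."""
    return (1974.0 / (avg_counts - software_offset - 2048.0)) * (dvm_volts / ideal_volts)


# Hypothetical calibration reading on the 200-mV range.
example = gain_constant(3964.5, 0.0, 0.200, 0.200)
```

With these inputs the result is close to 1.03, in line with the typical value quoted above. A voltmeter reading slightly above the ideal voltage raises the constant in proportion.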
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e «s some offse" 
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— e ctock e HP 
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Tne procedure (Fig. 2) followed for correcting the offsets \n 
one channel begins - 3 the input relay— the c- 

allows the input signal to pass through the front end Nex* 
is tur- - ch grounds The input to the amplifiers There 

are appropriate delays to let the relay and FET debounce arid 
settle to fixed values The processor ;s then abie to induce the 
ADC to convert the zero input twice The two samples come from 
the two sample-and-hold sections within the front end The result- 
ing values are stored in RAM Next the offset port is written with 
a number equal to one plus the original value The processor 
fnduces two more conversions, and the new values are compared 
with the previous values stored >n RAM If the new pair of values 
is closer to the desired zero value, based on internal computa- 
tions of the range and offset settings, the offset port value is 
incremented again and the process of comparison is repeated 
If the new values are farther than the previous set from the desired 
value then the offset port value is decremented twice, and two 
new values are found and compared with those for the original 
offset port number, If the new values are closer to the desired 
value, the offset port value is decremented once and the process 
ss repeated. The process stops when the most recent values 
from the ADC are farther than the previous values from the desired 
value 

The processor reverses the trend of incrementing or decrementing
the offset port value once, leaving the offset DAC at its
optimal value, takes 16 samples one millisecond apart for each
sample-and-hold, and averages these samples to eliminate any
60-Hz noise. The two averages have the desired offset value
subtracted from them, and the two differences are stored in RAM.
The result is that the offset port is at its optimal value and two
words are stored that correspond to the residual offsets
of the front end and each sample-and-hold. These words are
called the software offsets, and are used in correcting the data.
The zero FET is turned off and the input relay is closed. The front
end is now calibrated and ready for sampling the external input.
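
The search just described can be sketched in Python. This is only an illustration under stated assumptions: the SimulatedFrontEnd class, its gain of 4 counts per offset-port step, its residual offsets, and the use of the summed absolute error of the two sample-and-hold readings as the measure of "closer" are all hypothetical. The article does not give the firmware's actual comparison rule, and port range limits are ignored here.

```python
class SimulatedFrontEnd:
    """Hypothetical stand-in for the hardware with its input grounded."""

    def __init__(self, best_port=130, counts_per_step=4, residuals=(1, -2)):
        self.best_port = best_port
        self.counts_per_step = counts_per_step
        self.residuals = residuals
        self.port = 0

    def write_offset_port(self, value):
        self.port = value

    def convert_pair(self):
        # One reading from each of the two sample-and-hold sections.
        base = 2048 + self.counts_per_step * (self.port - self.best_port)
        return base + self.residuals[0], base + self.residuals[1]


def calibrate_offset(front_end, start_port, target=2048):
    """Step the offset port up or down until the readings stop improving."""

    def error(port):
        front_end.write_offset_port(port)
        s1, s2 = front_end.convert_pair()
        return abs(s1 - target) + abs(s2 - target)   # assumed closeness measure

    port = start_port
    best = error(port)
    step = 1                          # try incrementing first
    trial = error(port + step)
    if trial >= best:                 # no improvement: try decrementing instead
        step = -1
        trial = error(port + step)
    while trial < best:               # keep going while readings get closer
        port += step
        best = trial
        trial = error(port + step)
    front_end.write_offset_port(port)  # reverse the last, unhelpful step
    return port


def software_offsets(front_end, target=2048, samples=16):
    """Average 16 readings per sample-and-hold and subtract the desired value."""
    totals = [0, 0]
    for _ in range(samples):
        s1, s2 = front_end.convert_pair()
        totals[0] += s1
        totals[1] += s2
    return tuple(total / samples - target for total in totals)
```

In the real instrument the 16 samples are taken one millisecond apart so that the average spans roughly one 60-Hz line cycle; the simulation has no noise, so the timing is omitted.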

When the ADC samples data, its output must be corrected for
gain and offset. Each time a conversion takes place, a 10-bit
counter is incremented and the least significant bit is the index
for which sample-and-hold (1 or 2) corresponds to the data
sample. The uncorrected data is inserted into the following formula:

$$(D_i - V_{osi} - \text{Ideal Zero}) \times GF(J) + \text{Ideal Zero} = D_{corr}$$

where $D_i$ corresponds to the uncorrected data of sample-and-
hold i (i = 1 or 2), $V_{osi}$ equals the software offset for sample-and-
hold i, Ideal Zero is the binary equivalent of the offset scaled to
0 to 4095 where 2048 represents a zero offset, and GF(J) is the
gain factor word stored in the EEPROM plus a word for range J
(J = 1 through 14, corresponding to the 5-mV through 100V
ranges).
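
As a rough numerical check of the two formulas in this box, the following Python sketch computes a gain constant and applies the correction to one sample. It assumes the zero code in the gain formula is 2048, treats the gain factor as a single floating-point number rather than the EEPROM word plus range word that the instrument uses, and the sample values (an average of 3966 counts and a software offset of 2) are invented for illustration.

```python
IDEAL_ZERO = 2048          # 12-bit code representing zero offset
FULL_SCALE_COUNTS = 1974   # numerator of the gain-constant formula


def gain_constant(average, software_offset, dvm_volts, ideal_volts):
    """Gain constant from the averaged full-scale calibration samples."""
    counts = average - software_offset - IDEAL_ZERO
    return (FULL_SCALE_COUNTS / counts) * (dvm_volts / ideal_volts)


def correct_sample(raw, software_offset, gain_factor, ideal_zero=IDEAL_ZERO):
    """Apply the offset and gain correction to one uncorrected ADC sample."""
    return (raw - software_offset - ideal_zero) * gain_factor + ideal_zero


def sample_and_hold_index(conversion_counter):
    """Least significant bit of the conversion counter selects the sample-and-hold."""
    return conversion_counter & 1


gain = gain_constant(average=3966, software_offset=2,
                     dvm_volts=0.2, ideal_volts=0.2)
corrected = correct_sample(3966, 2, gain)
```

With these made-up inputs the gain constant comes out near the typical value of 1.03 quoted above, and the full-scale sample is corrected to 2048 + 1974 counts.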

Stephen D. Goodman 

Development Engineer 
San Diego Division

adjust its acceleration profile to reduce the plot time by
removing the need to come to a complete stop after each
data point. When in the RECORD DIRECT mode, the digitized
input signal data is fed directly to the servo control system,
bypassing the data buffers, and the pen follows the input
signal in the continuous (nonvector) manner of conven-
tional X-Y recorders.

The servo system uses the familiar dc motors and optical
position encoders that are common to all modern digital
plotters. But unlike such plotters, this servo system uses
an algorithm that closes the servo loop and allows the
device to emulate the analog-like characteristics of tradi-
tional X-Y recorders. This is done by using the micropro-
cessing system and another semicustom LSI circuit, a
CMOS 2000-gate array. This hardware combination allows
the processing system to model the characteristic block
diagram of a traditional analog servo system in a manner
fast enough to appear real-time to the user when recording
slow-moving signals (under a few cycles per second). In
this mode, the HP 7090A performs in exactly the same
manner as a conventional X-Y recorder.

Another feature of the HP 7090A is its ability to draw
its own grids. No longer is the user forced to try to align
the desired measurement to a standard inch or metric grid.
The user simply specifies the required number of grid di-
visions, from one to one hundred, by using the HP 7090A's
front-panel controls. A firmware algorithm is invoked by
pressing the front-panel GRID button, which then draws
the specified grid between the specified zero and full-scale
points.

The graphs created by the HP 7090A can be used for
observing the trends of the measurement. The high-accuracy
measurement made possible by the 12-bit ADC can
be appreciated further by using the internal character
generator to annotate any desired data point with three-
digit resolution.

The processor also makes possible other features that
enhance the measurement display capability of the HP
7090A. A calendar clock IC backed up with a battery and
connected to the processor can be used to provide labeling
of time and date at the push of a front-panel button. A
nonvolatile memory (EEPROM) IC stores front-panel setup
conditions, and two internal digital-to-analog converters
convert digital data in the buffer memory to analog signals
that can be displayed on a conventional oscilloscope to
preview the buffer data, if desired, before plotting.

Data Acquisition System Features 

The HP 7090A can be used as a computer-interfaced data
acquisition system by using its built-in HP-IB (IEEE 488)
I/O capabilities. All setup conditions and measurements
can be controlled remotely by using an extension of the
HP-GL (Hewlett-Packard Graphics Language) commands
tailored for measurements. The data in the buffer can be
transferred to a computer. The computer can process the
data and then address the HP 7090A as a plotter to display
the results.

The HP 17090A Measurement Graphics Software pack- 
age (see article on page 27) was developed to provide user- 
friendly access to the many measurement capabilities of 
the HP 7090A.

Acknowledgments

The final design challenge was to offer the above capabil-
ities without increasing the price above that of a conven-
tional X-Y recorder. We would like to thank the many
departments of HP's San Diego Division that helped make 
this dream a reality. 



Digital Control of Measurement Graphics 



by Steven T. Van Voorhis 



THE OBJECTIVE of the servo design team for the HP
7090A Measurement Plotting System was to develop
a low-cost servo capable of producing quality hard-
copy graphics output, both in real time directly from the
analog inputs and while plotting vectors either from the
instrument's internal data buffer or received over the HP-IB
(IEEE 488) interface. The mechanical requirements of the
design were met by adopting the mechanics of the earlier
HP 7475A Plotter. This approach had the significant advan-
tage of a lower-cost solution than could have been achieved
with a new design. What remained then was to design the
electronics and firmware for reference generation and con-
trol of the plant (dc servo motor and mechanical load).



Servo Design 

Fig. 1 is a block diagram of the major components of the
HP 7090A servo design for one axis, there being no significant
difference between the pen and paper axes for the purposes
of this discussion. Fig. 2 shows the corresponding
servo model. The plant is modeled as a system with the
transfer function $K_m/[(s+P_1)(s+P_2)]$. Feedback of position
and velocity was found to give sufficient control to meet
the line-quality objectives.

The prime mover for each axis is a low-cost dc servo
motor. Feedback of motor shaft position is provided by a
500-line optical encoder. By detecting all state changes of
the two-channel quadrature output of the encoder, 2000
encoder counts per revolution of the motor shaft can be
detected. This yields an encoder resolution of slightly better
than 0.001 inch at the pen tip. Since the feedback is
derived from the motor shaft and not the pen tip, any plant
dynamics between these two points are open-loop with
respect to the servo system. It is therefore essential that
the mechanics be "stiff" between the motor shaft and the
pen tip within the 100-Hz bandwidth of the servo system.

Fig. 1. Block diagram of HP 7090A servo system.
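
The factor of four between encoder lines and encoder counts comes from counting every state change of the two quadrature channels. The Python sketch below illustrates that counting with a conventional transition table; it is not the gate array's actual logic, and the choice of which channel sequence counts as "forward" is an assumption.

```python
# Count change for each (previous, new) state pair, with state = (A << 1) | B.
# Index into the table is (previous << 2) | new. Invalid two-bit jumps and
# repeated states count as 0 in this sketch.
_STEP = (0, +1, -1, 0,
         -1, 0, 0, +1,
         +1, 0, 0, -1,
         0, -1, +1, 0)


def count_quadrature(states, start=0b00):
    """Accumulate +1 or -1 for every valid state change in the sequence."""
    count = 0
    previous = start
    for state in states:
        count += _STEP[(previous << 2) | state]
        previous = state
    return count


LINES_PER_REV = 500
FORWARD_CYCLE = (0b01, 0b11, 0b10, 0b00)   # the four states seen per encoder line

counts_per_rev = count_quadrature(FORWARD_CYCLE * LINES_PER_REV)
```

Each encoder line produces four state changes, so 500 lines give 2000 counts per revolution. A resolution slightly better than 0.001 inch then implies slightly less than 2 inches of pen travel per motor revolution.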

The digital electronics portion of the control loop is im-
plemented in a single gate array of some 2000 gates pack-
aged in a 40-pin dual in-line package. The two-channel
quadrature feedback signals from the optical encoders are
decoded within the gate array to clock two 8-bit relative
position counters, one for each axis. The position counters
are cleared on each read by the microprocessor, in essence
providing velocity feedback to the microprocessor. The mi-
croprocessor integrates this feedback to generate position
information. The power supply check circuitry provides
the microprocessor with a 6-bit measurement of the motor
drive supply voltage.

In the feed-forward path, the microprocessor controls
each motor by writing to two 8-bit registers for each axis
in the gate array. The two registers control the period and
duty cycle of the pulse-width-modulated motor drive sig-
nals. Pulse-width-modulated motor drive circuits were
chosen because of the ease of interfacing to digital systems
and their efficiency advantage over linear drivers. Using
the feedback of the motor drive supply voltage, the micro-
processor can adjust the period of the drive signal to regu-
late the gain of the drive path. This eliminates the need
for a regulated supply for the motor drivers. The
microprocessor varies the duty cycle of the pulse width
modulator as dictated by the solution of the control equa-
tions to achieve control of the plant.

When sampling the front-end channel at high sample
rates, there is not sufficient processing power available
from the 6809 microprocessor to execute both the channel
and the servo routines in real time. Thus, a multiplier
under microprocessor control is provided to allow the gate
array to close a position loop about the plant without mi-
croprocessor intervention. To avoid any instability caused
by loss of velocity information, the position loop gain is
halved when this is done. This allows the microprocessor
to supervise the channel data transfer without the overhead
of executing the servo routines. Other miscellaneous cir-
cuitry in the servo gate array provides pen-lift control, the
microprocessor watchdog timer, the front-end channel
communications serializer, and a chip test

The real-time servo routines are initiated by a nonmask- 
able interrupt, which is run at a 1-kHz rate while plotting. 
Aside from various housekeeping duties, the main respon- 
sibilities of the servo routine are to maintain control of the 
plant by closing the feedback loop, and to generate the 
reference inputs to drive the system. 

Closing the feedback loop is always done in the same
manner while plotting either vectors or data directly from
the front-end channels. The relative position register is
read and summed with the old plant position to generate
the updated plant position. A copy of the relative position
register value is multiplied by the velocity feedback con-
stant to generate the velocity feedback term. The plant po-
sition is subtracted from the reference input to generate
the position error. From this, the velocity feedback term is
subtracted and a deadband compensation term is added to
generate the control value to be sent to the pulse width
modulator. The power supply check register is read and
the period of the pulse width modulator is adjusted to
ensure a constant gain for the motor drive block.
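
One pass of this loop can be sketched in Python as follows. The sketch follows the order of operations given above, but several details are assumptions made for illustration: the names AxisState and servo_update are hypothetical, the deadband term is taken to push the command away from zero, and the pulse width modulator period is taken to scale in proportion to the measured supply voltage. The article does not state the exact form of either compensation.

```python
from dataclasses import dataclass


@dataclass
class AxisState:
    position: int = 0   # integrated plant position, in encoder counts


def servo_update(state, relative_counts, reference, kv, deadband,
                 supply_reading, nominal_supply, nominal_period):
    """One 1-kHz iteration for one axis; returns (control value, PWM period)."""
    # Sum the relative position register into the plant position.
    state.position += relative_counts
    # The same register value, scaled, serves as the velocity feedback term.
    velocity_term = kv * relative_counts
    position_error = reference - state.position
    control = position_error - velocity_term
    # Assumed deadband compensation: push the command away from zero.
    if control > 0:
        control += deadband
    elif control < 0:
        control -= deadband
    # Assumed gain regulation: lengthen the period as the supply voltage rises.
    period = nominal_period * supply_reading / nominal_supply
    return control, period


axis = AxisState()
control, period = servo_update(axis, relative_counts=5, reference=105, kv=2,
                               deadband=3, supply_reading=40,
                               nominal_supply=32, nominal_period=100)
```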

Plotting Data 

There are three separate reference generators that can be
invoked, depending on the mode of plotting. The first is
for direct recording of the front-end channel data, the sec-
ond is used when plotting vectors parsed from the I/O bus
(HP-IB), and the third is used when plotting from the HP
7090A's internal data buffer. When directly recording front-
end channel data, the inputs are continuously sampled at
250 Hz and the internally generated time base is updated
at the same rate. The samples are scaled according to the
prevailing setup conditions to provide the new desired
position for the pen and paper axes. Were these inputs fed
directly to the servos, high-frequency or noisy input signals
could easily result in motor overheating. The new desired
positions are therefore passed to the servos through a refer-
ence profiler, which limits plant acceleration to 2g and
maximum slewing speed to 50 inches per second. This
limits the power input to the motors to a safe operating
level and preserves acceptable writing quality. This ap-
proach results in no overshoot when recording step inputs
and provides good reproduction of 1-cm peak-to-peak
sinusoidal waves for frequencies below 10 Hz.
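
The 10-Hz figure is consistent with the profiler limits, as the following Python check suggests. It assumes an ideal sinusoid of amplitude A, whose peak velocity is 2πfA and whose peak acceleration is (2πf)²A; the function name and the use of standard gravity for g are choices made for this sketch.

```python
import math

G = 9.80665                   # standard gravity, m/s^2
ACCEL_LIMIT = 2 * G           # profiler acceleration limit
SPEED_LIMIT = 50 * 0.0254     # 50 inches per second, in m/s


def max_sine_frequency(peak_to_peak_m):
    """Highest sine frequency that stays within both profiler limits."""
    amplitude = peak_to_peak_m / 2
    f_accel = math.sqrt(ACCEL_LIMIT / amplitude) / (2 * math.pi)
    f_speed = SPEED_LIMIT / (2 * math.pi * amplitude)
    return min(f_accel, f_speed)


limit_hz = max_sine_frequency(0.01)   # 1-cm peak-to-peak trace
```

For a 1-cm peak-to-peak trace the acceleration limit is the binding one, and the result falls just under 10 Hz.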

When the HP 7090A operates as a plotter, HP-GL com-
mands received over its HP-IB interface are parsed in accor-
dance with the current graphics environment to generate
new desired pen and paper locations. These new locations
are represented as two-dimensional vectors relative to the
present location. These vectors are passed to a vector refer-
ence generator via a circular queue capable of storing up
to 30 vectors. The vector reference generator takes vectors
from the queue and profiles the input to the servos to con-
strain the plant to a constant 2g acceleration and 75-cm/s
maximum vector velocity. Fig. 3 depicts the profiling of
two consecutive vectors. The second vector is long enough
for the pen to reach maximum velocity and then to slew at
this velocity for some time before the onset of deceleration.

A short pause of 12 milliseconds between vectors ensures 
settling of the plant at the vector endpoints. The references 
for the paper and pen axes are simply scaled from the 
vector profile by the cosine and sine, respectively, of the 
angle between the vector and the positive paper axis. 
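
A minimal Python sketch of this kind of profile is given below. It computes the duration of a single vector under a constant-acceleration, limited-velocity (trapezoidal) profile and splits a distance along the vector into paper and pen components. The function names are hypothetical, the 12-ms pause is not included, and the real firmware works in 1-ms servo iterations rather than continuous time.

```python
import math

G = 9.80665   # standard gravity, m/s^2


def profile_duration(length_m, accel=2 * G, v_max=0.75):
    """Time to draw one vector from rest to rest, in seconds."""
    ramp_distance = v_max ** 2 / accel      # distance used by accel plus decel
    if length_m >= ramp_distance:
        # Long vector: accelerate, slew at v_max, decelerate.
        return 2 * v_max / accel + (length_m - ramp_distance) / v_max
    # Short vector: maximum velocity is never reached.
    return 2 * math.sqrt(length_m / accel)


def axis_components(distance_along_vector, d_paper, d_pen):
    """Scale the profile by cosine (paper axis) and sine (pen axis)."""
    angle = math.atan2(d_pen, d_paper)      # angle from the positive paper axis
    return (distance_along_vector * math.cos(angle),
            distance_along_vector * math.sin(angle))


long_vector_s = profile_duration(15 * 0.0254)   # a single 15-inch vector
paper, pen = axis_components(5.0, 3.0, 4.0)
```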

Vector Profiler 

Plotting from the internal data buffer could be performed
in exactly the same manner as plotting vectors from the
HP-IB interface. However, several attributes of this mode
of plotting led to the development of a new reference
generator. The first is that for each channel to be plotted,
a string of 1000 vectors is already stored in the internal
data buffer. Thus, the overhead of running the HP-IB inter-
rupt routines, the parser, character generator, and other
routines to create vectors is eliminated. Second, since the
functions to be plotted are continuous, the 1000 data points
form a contiguous string of short vectors (typically less
than 0.025 inch), all plotted pen down. Furthermore, the
angle formed between any two consecutive vectors is typ-
ically very shallow.

Consider the trivial case of plotting a dc signal from the
internal data buffer. Assuming a 15-inch trace on B-size
paper, this amounts to plotting 1000 vectors, each of length
0.015 inch, all along a straight line. Using the HP-IB vector
reference generator would require 10 ms to profile the ac-
celeration and deceleration of each vector, plus a 12-ms
intervector delay. Thus, it would require 22 seconds to
draw this 15-inch line, whereas if it were plotted as a single
vector at 75 cm/s, it would require just over 0.5 second.
Therefore, a new vector profiler was designed for plotting
from the internal data buffer with the objective of improv-
ing throughput. This algorithm does not require a stop at
each vector end point. Rather, it constrains the vector end-
point velocity so that the following three conditions are
met:
■ The angle drawn at the vector end point is drawn with
negligible error.
■ The vector is not drawn in less than eight iterations of
the servo interrupt routines (i.e., 8 ms).
■ A 2g deceleration to a full stop at the end of the vector
string is achievable.

Using this internal data buffer reference profiler, a 15-
inch dc signal trace is plotted in 8 seconds, because of the
second constraint. This is nearly a factor of three in
throughput improvement compared to using the HP-IB vec-
tor reference generator. In fact, many functions are plottable
in the 8-second minimum time with this technique, result-
ing in throughput gains as high as eight.
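
The arithmetic behind these figures for the dc trace is short enough to check directly. The Python fragment below uses only the numbers quoted in the text; the variable names are chosen for this sketch.

```python
VECTORS = 1000          # data points per channel in the internal buffer
PROFILE_MS = 10         # acceleration and deceleration time per short vector
PAUSE_MS = 12           # intervector delay of the HP-IB reference generator
MIN_VECTOR_MS = 8       # eight 1-ms servo iterations per vector

hpib_seconds = VECTORS * (PROFILE_MS + PAUSE_MS) / 1000
buffer_seconds = VECTORS * MIN_VECTOR_MS / 1000
improvement = hpib_seconds / buffer_seconds
```

The ratio of 22 seconds to 8 seconds is 2.75, which is the "nearly a factor of three" mentioned above.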

Why not apply the same profiling technique to vectors 
received over the HP-IB interface? The answer is twofold. 
First, vectors plotted from the bus are generally not contigu- 
ous strings representing continuous functions. They typi- 
cally have many pen up/down cycles, form acute angles, 
and are longer, all of which reduce the throughput gain 
using this algorithm. Second, applying the three conditions 
to determine the vector endpoint velocity requires addi- 
tional processing of each vector to check angles and deter- 
mine the distance to the end of the current string of vectors. 
To do this in real time requires that, as each new vector is
received, the processor backtrack through the list of current
unplotted vectors to see if their endpoint velocities can be
increased. When the nature of the plot is such that little
throughput gain is possible from the application of these
algorithms, the additional processing load of executing 
them can actually result in a throughput loss. Therefore, 
this approach is restricted to plotting of the internal data 
buffers where the throughput gains are the greatest. 



Fig. 3. Profiling of two typical vectors parsed from the HP-IB.



Measurement Graphics Software



by Francis E. Bockman and Emil Maghakian



HP 17090A MGS IS A SOFTWARE PACKAGE written
for the HP 7090A Measurement Plotting System that
runs on HP's Series 200 Computers. MGS allows the
user to:

■ Set up measurements

■ Take measurements

■ Store and retrieve measurement data to and from disc
files

■ Annotate measurements with text, axes, and simple
graphics

■ Manipulate measured data

■ Provide soft and hard copy of measured and manipulated
data.

MGS was written to provide a system solution to some 
of the general problems of measurement recording and data 
acquisition. It is designed to be used by scientists and en- 
gineers not wanting to write their own software. This soft- 
ware package extends the capabilities of the stand-alone 
HP 7090A. 

The package consists of two discs. The first disc contains
the core of the system, the initialization routines, the library
routines, and the memory manager. The second disc con-
tains six code modules, one for each functional subsystem.
The measurement setup module contains code to help the
user specify the setup parameters relevant to the measure-
ment. The measurement module allows one to start the
measurement and the flow of measurement data into the
computer. The storage-retrieval module contains code to
store and retrieve measurement data and setup information
to and from disc memory. The data manipulation module
implements the ability to manipulate measurement data
mathematically. The annotation module adds the capabil-
ity of adding graphical documentation to the measurement
record. The display module allows a user to display the
measurement data taken and the annotation on either the
computer's display screen or on paper.



[Fig. 1 screen layout labels: System/Subsystem Name, Message Line, Input Line]



Since MGS is intended for the instrument scientific mar-
ket where users write their own instrument con-
trol software, we used BASIC as the application language.
Hence, users of the package can add their own code in a
commonly understood language to tailor it to their specific
needs. The application is distributed in source form.

Human Interface 

The human interface is designed for both novice and 
expert users. We have made the assumption that all our 
users are familiar with X-Y recording, and that they have
used recorders for data measurement.

A human interface should be self-explanatory and de-
scriptive to accommodate a novice user. An expert user,
on the other hand, requires an interface that is like a tool,
one that does not hamper creativity and does not ask a lot
of questions (conversational).

MGS's human interface is an extension of the HP 7090A's
human interface. There are no operational conflicts be-
tween the HP 7090A and MGS.

Screen layout is an important part of every human inter- 
face. We have made a special effort to ensure a consistent 
screen layout (Fig. 1) throughout the modules to improve
the feedback to the user. Fig. 2 is an example of an actual
CRT display for MGS. The definitions for the various ele-
ments of the screen layout are:

1) Subsystem Name. This is the name of the subsystem.
Each box on the design tree (Fig. 3) is a subsystem. For
instance, the DISPLAY functional area is composed of the
CHANGE SETUP, SCREEN, and PLOTTER subsystems. The
CHANGE SETUP subsystem also has under it the CHANGE
SCALE subsystem (not shown).

2) Arrow. The user can only change one parameter setting
at a time. The arrow points to the parameter that is currently
modifiable. The arrow is controlled by the softkeys UP (k0)
and DOWN (k5).

3) Parameter Name Area. This area of the CRT is where
the parameter names are displayed.

4) Current Parameter Setting Area. The current parameter
setting is displayed on the same line as the parameter name.
The parameter setting is enclosed by angle brackets. For
example:

     param1       <param1 setting>
    >param2       <param2 setting>
       ...              ...
     paramN       <paramN setting>

where parameter 2 is currently selected to be modified.

Fig. 1. Screen layout for MGS.

Fig. 2. MGS control display.

5) Help Area. This area of the CRT is used to display help
information to the user, which will consist of either the
current valid range of parameter settings for the parameter
designated by the arrow, or information on how to set the
parameter, or what to set the parameter to.

6) Message Line. This line is used by the software to dis-
play messages to the user. When applicable, it will specify
the permissible range of parameter settings.

7) Input Line. This line is used for entering text and num-
bers when required by MGS.

8) CRT Softkey Labels. This area displays the labels for
the HP 9000 Series 200 Computer's softkeys. The labels
shown in Fig. 1 do the following actions when the corre-
sponding softkeys are pressed:

UP (k0)      : Moves the arrow up one parameter.

DOWN (k5)    : Moves the arrow down one parameter.

DEFAULT (k3) : Sets the current menu parameters to
               their default settings.

HELP (k4)    : This softkey has an on/off toggle action.
               An asterisk in the softkey label implies
               the help information will be displayed
               in the help area on the CRT for the
               current menu and all the following
               menus. This softkey may be toggled on
               and off as many times as necessary.

EXIT (k9)    : Returns the user up one level of the
               tree to the previous subsystem.

The primary user input to the software is the knob and 
the softkeys on the keyboard of the Series 200 Computer. 
Input from the keyboard has been limited as much as pos- 
sible. The softkeys provide the user with the ability to 
control the flow through the design tree (Fig. 3). 

The knob controls the setting of the parameter selected 
by the arrow on the menu. To set any parameter, the knob 
must be rotated first. The software will then react in one 
of the ways listed in Table I. 

Table I

Parameter Type        Software Reaction

Enumerated (i.e.,     Turning the knob will scroll through the current
specific list of      valid parameter settings for the specified
settings)             parameter.

Positional            Turning the knob will move the graphics cursor
                      in a left or right direction. Turning the knob with
                      the SHIFT key held down will move the graphics
                      cursor in an up or down direction.

Number with           Turning the knob will cause the parameter set-
limited range         ting to be incremented or decremented by a
                      small amount. Turning the knob with the SHIFT
                      key held down will cause the parameter setting
                      to be incremented or decremented by a large
                      amount.

Text or number        Turning the knob will cause a message to be
with unlimited        displayed on the message line and the current
range                 setting to be displayed on the input line.
                      Then the user may modify this setting by typing
                      in the new setting and pressing the ENTER key
                      when correct.



Fig. 3. Conceptual layout of MGS.

The major philosophy in this human interface is "mod-
ification, not specification." This means that at all times
the system settings are valid, and the user can change one valid
setting to another. The user is not burdened by descriptions
or questions. The help area describes the current possible
settings. It is placed to one side intentionally so it does not
interfere with the man-machine interface. It can be turned
on and off at the user's discretion.

The design of the human interface limits the number of
error states. The user can only see an error message when
entering a number from the keyboard or using the HP 7090A
to enter a voltage. We have managed to achieve this goal
by updating the next possible state lists every time a param-
eter is modified.

Overall Design 

There is a menu associated with every node of the design
tree (Fig. 3). The tree is three levels deep from the main
level. The main level consists of the menu that allows
access to the six major functional modules: measurement
setup, measurement, display, annotation, storage/retrieval,
and data manipulation. The softkeys are labeled according
to their function; pressing a softkey will place the appro-
priate menu on the CRT. The general rule is that a user
exits a menu to the same menu(s) the user went through
to enter the menu. Pressing the EXIT softkey returns the
user up one level of the tree. The configuration level is a
one-time process and is only entered at the start of the
program. Pressing the EXIT softkey at the main level will
stop the program after verifying that the user really wants
to exit.

Core Library and Swapper 

The software package consists of a core or kernel that
must always reside in memory. There is additional code
for initialization and configuration that is loaded initially
and then removed from memory after running. The six
main code modules that implement the functionality of
the system can be either resident in memory or loaded from
disc, depending on the system configuration and available
memory. There is also a library of utility routines that
resides in memory with the kernel. The library contains code
to handle the screen menus and data structures. Also, the
code that communicates with the HP 7090A for data trans-
mission resides in the library.

A part of the system known as the swapper, or memory
manager, is responsible for ensuring that there is enough
memory available for requested operations. At program ini-
tialization time, the swapper loads in the whole system if
there is enough memory; if not, it loads just the main section
of the system and the supporting libraries. Provided enough
memory exists for the former action to take place, the swap-
per will not need to take further action. Assuming there is
insufficient memory to load the complete system, the swap-
per will take actions when memory allocation is needed.
The swapper handles all requests to enter a subsystem from
the main menu. It first checks to see if the subsystem is in
memory. If it is, no action is taken by the swapper and the
subsystem is entered. If the subsystem is not in memory,
the swapper checks to see if enough memory is available
to load it in. If so, it is loaded and entered. Otherwise,
space in memory will be made available by removing other
subsystems not needed.
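
The swapper's decision sequence can be sketched as follows. MGS itself is written in BASIC; Python is used here only for illustration. The class name, the memory sizes, and the eviction order (oldest loaded subsystem first) are assumptions, since the article does not say how the swapper chooses which subsystems to remove.

```python
class Swapper:
    """Hypothetical model of the MGS memory manager."""

    def __init__(self, free_memory, sizes):
        self.free = free_memory
        self.sizes = dict(sizes)     # subsystem name -> memory needed
        self.resident = []           # subsystems in memory, in load order
        # At initialization, load the whole system if it fits.
        if sum(self.sizes.values()) <= self.free:
            for name in self.sizes:
                self._load(name)

    def _load(self, name):
        self.free -= self.sizes[name]
        self.resident.append(name)

    def enter(self, name):
        """Make sure a subsystem is in memory before it is entered."""
        if name in self.resident:
            return                   # already in memory: no action needed
        needed = self.sizes[name]
        # Remove other subsystems until there is room (assumed: oldest first).
        while self.free < needed and self.resident:
            victim = self.resident.pop(0)
            self.free += self.sizes[victim]
        if self.free < needed:
            raise MemoryError("not enough memory for " + name)
        self._load(name)


sizes = {"setup": 40, "measure": 30, "display": 50}
small = Swapper(free_memory=80, sizes=sizes)
small.enter("setup")
small.enter("measure")
small.enter("display")               # forces "setup" out to make room
large = Swapper(free_memory=200, sizes=sizes)
```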

Data Structures for Menus 

As mentioned earlier, all the menus in MGS are consis- 
tent. There is a single data structure that contains all the 
data for a screen. The diagram in Fig. 4 gives a graphical 
representation of the logical structure and Table II defines 
the elements shown in Fig. 4. 

MGS prevents the user from entering error states. This 
task is done by changing o_strt and o_cnt entries for a given 
attribute. All the valid entries for attribute p are always 
between o_strt(p) and o_strt(p) +o_cnt(p). 
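
One way to picture the role of o_strt and o_cnt is as a movable window over a row of the option table. The Python sketch below (MGS itself is in BASIC) steps the displayed option index within that window. It assumes the window covers indices o_strt through o_strt + o_cnt - 1 and that scrolling wraps around at the ends; neither detail is stated in the article, and the option values shown are invented.

```python
def next_option(o_ind, o_strt, o_cnt, step=1):
    """Move the displayed option index by one knob step inside the valid window."""
    return o_strt + (o_ind - o_strt + step) % o_cnt


ranges = ["5 mV", "10 mV", "20 mV", "50 mV", "100 mV"]   # hypothetical option row
o_strt, o_cnt = 1, 3          # only "10 mV", "20 mV", "50 mV" are currently valid

after_last = next_option(3, o_strt, o_cnt)        # step past the last valid entry
before_first = next_option(1, o_strt, o_cnt, -1)  # step back from the first one
```

Because the index can never leave the window, the knob can never select an invalid setting, which is the property the text describes.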

This data structure is built using one- and two-dimen-
sional arrays in BASIC. There are several copies of this
structure, one for each screen layout. The data definition
portion of MGS would have been much smaller and storage
more efficient if BASIC had dynamic storage allocation
capability like Pascal.

Data Structure for the Knob 

MGS relies heavily on the knob of the Series 200 Comput- 
ers for input. At times the knob is used for entering a 
numeric value, such as volts at full scale, total time, etc. 
To make the knob more useful we had to make it nonlinear.

Fig. 4. Graphical representation of MGS data structure for a
screen display.

Table II

Element           Definition

name$             Holds parameter names.
name_pos          Holds the encoded x-y position of the names on
                  the screen. There is one entry for every name$
                  entry. Instead of using two integer arrays for
                  keeping the x and y positions, the following
                  encoding scheme is used:
                      name_pos(i) = x(i)*512 + y(i).
                  This is done to conserve storage space.
(name illegible)  Holds the number of entries in the name$ and
                  name_pos columns.
attr$             Holds the current parameter setting.
attr_pos          Holds the x-y position of where parameters are to
                  be displayed.
pntr_pos          Holds the x-y position of where the pointer is to be
                  displayed for each parameter.
option$           A two-dimensional structure. Each row of this
                  structure holds all the possible settings for the
                  corresponding parameter.
o_cnt             Holds the count of valid entries per row in the
                  option table.
o_ind             Holds the current index of the item from the
                  option table that is being displayed, that is,
                  attr$(i) = option$(i, o_ind(i)).
o_strt            Holds the logical first entry in option$.
o_pos             Holds the x-y position of where options are to be
                  displayed.
o_max             Maximum number of options in the option table.
title             Holds a string that contains a screen name.
pointer           Points to the current row in the option table. The
                  UP and DOWN softkeys change the value of this
                  variable.
a_cnt             Holds the number of parameters in the data
                  structure.
old_pntr          Holds last value of the pointer.
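
The position encoding used for name_pos packs two coordinates into one integer. A minimal Python sketch of the idea follows (MGS itself is in BASIC). It reads the encoding as x*512 + y and assumes y always lies in the range 0 to 511, which is what makes the packing reversible; the function names are hypothetical.

```python
ROW_STRIDE = 512   # multiplier from the encoding formula


def encode_position(x, y):
    """Pack an x-y screen position into a single integer."""
    if not 0 <= y < ROW_STRIDE:
        raise ValueError("y must be in the range 0..511")
    return x * ROW_STRIDE + y


def decode_position(code):
    """Recover (x, y) from a packed position."""
    return divmod(code, ROW_STRIDE)


packed = encode_position(3, 17)
unpacked = decode_position(packed)
```

One packed array replaces two integer arrays, which is the storage saving the table entry refers to.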



This means the step size of the knob is dependent on the
current value of the knob. For example, when the current
value for volts at full scale is between 0 and 1V, the incre-
ment is 0.05V, and when the current value is between 50
and 100V, the increment is 1V.

To make this task uniform throughout MGS the data
structure outlined in Fig. 5 is used.

Each table contains several rows of data. Each row is for
a given range. Table III defines the parameters.



Table III 



If c_curr > upper_bound(c_index) then c_index = c_index + 1 and c_curr
    = lower_bound(c_index)
If c_curr < lower_bound(c_index) then c_index = c_index - 1 and c_curr
    = upper_bound(c_index)

Every time the value of c_index is changed, the following
condition must be checked:

If c_index > c_hi_ind then c_index = c_low_ind
If c_index < c_low_ind then c_index = c_hi_ind

There is a copy of this data structure for every numeric
parameter. Again, this is because of the limitations of
BASIC.
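
The knob rules above can be sketched in Python (MGS itself is in BASIC). The field names follow Table III, but the table contents are hypothetical: values are in millivolts to avoid floating-point drift, only the 50-mV and 1-V increments come from the text, and the middle row is invented. The sketch also applies the index wraparound check before loading the new bound, which is one reading of the rules.

```python
from dataclasses import dataclass


@dataclass
class KnobTable:
    increment: list
    lower_bound: list
    upper_bound: list
    c_low_ind: int
    c_hi_ind: int
    c_index: int
    c_curr: int


def wrap_index(table):
    """Check required every time c_index changes."""
    if table.c_index > table.c_hi_ind:
        table.c_index = table.c_low_ind
    elif table.c_index < table.c_low_ind:
        table.c_index = table.c_hi_ind


def turn_knob(table, direction):
    """Apply one knob step; direction is +1 or -1."""
    table.c_curr += direction * table.increment[table.c_index]
    if table.c_curr > table.upper_bound[table.c_index]:
        table.c_index += 1
        wrap_index(table)
        table.c_curr = table.lower_bound[table.c_index]
    elif table.c_curr < table.lower_bound[table.c_index]:
        table.c_index -= 1
        wrap_index(table)
        table.c_curr = table.upper_bound[table.c_index]
    return table.c_curr


volts_full_scale = KnobTable(
    increment=[50, 500, 1000],              # mV per step in each row
    lower_bound=[0, 1500, 51000],
    upper_bound=[1000, 50000, 100000],
    c_low_ind=0, c_hi_ind=2, c_index=0, c_curr=950)
```

Starting at 950 mV, successive upward steps give 1000 mV and then 1500 mV as the knob crosses into the coarser row, and a downward step returns to 1000 mV.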

Measurement Setup Module 

In this module, the user sets up an experiment and
specifies dependent channels, independent channels,
triggering mode, duration of experiment, type of experiment,
etc. Accessible through this module are channel setup
modules. In those modules the user sets range, offset, and
trigger level and width for each channel. If the measurement
is to be conducted in user units, the user specifies the
minimum and maximum user units, instead of range and offset.

Up to now, most users of X-Y recorders had to convert
their units to voltage levels, and then take a measurement 
in volts. Finally, they had to convert volts back to their 
units. This is also the case with the stand-alone HP 7090A. 

MGS allows the user to set up an experiment in volts.
This is provided for the sake of consistency with the
stand-alone machine. In addition to volts, MGS gives the user
the capability of setting up and taking a measurement in
some other unit system: displacement, acceleration, force,
saturation, etc. To set up a measurement in volts, the user
specifies range and offset settings for each channel and
trigger information for Channel 1, just as for the stand-alone
HP 7090A.

When in user units, a measurement is set up by specifying
the minimum and maximum possible readings in user units
for each channel and trigger information for Channel 1.
Trigger information is specified in user units. We believe
that the availability of user units enhances the usefulness
of MGS. For example, in measuring temperature in a chemical
experiment, we can set user units limits for Channel
1 to -100°C and 100°C and set the trigger level to 10°C.



Element        Definition

increment      Holds the value by which the current setting will be incremented.
lower_bound    Holds the minimum limit of the range.
upper_bound    Holds the maximum limit of the range.
c_low_ind      Holds the first legal row of the table.
c_hi_ind       Holds the last legal row of the table.
c_index        Points to the current row in the table.
c_curr         Holds the current value. This is the variable that is being incremented and decremented.

c_low_ind and c_hi_ind are used to control the legal limits
of the knob. Valid limits are kept between the high and
low indexes.

The following conditions are used for moving up and 
down in the table: 



Fig. 5. Data structure for knob control. Each row of the table holds an increment, a lower bound, and an upper bound.

Fig. 6. Example MGS plot (titled "Circuit Turn-On Characteristics", plotted against time in seconds) showing a calculated parameter (power) versus measured current and voltage.



Measurement Module 

The measurement subsystem implements the ability to
take measurements. It starts the measurement and, when
data becomes available, receives the data and stores it in
the software's channel buffers. There are three types of
measurements: direct on-screen, direct on-paper, and data
streaming. In direct on-screen measurements, the data is
plotted to the screen in real time as the data is being stored
in the software's channel buffers. Direct on-paper
measurements emulate a traditional X-Y recorder and no data is
sent to the computer. Data streaming mode allows up to
ten thousand samples from each of three channels to be
buffered into memory and then written out to a disc file
for later processing.

Display Module 

The display subsystem allows measurements and annotation
to be displayed on the screen or on paper. There is
a display setup mode that allows the user to specify which
data channels of the measurement will be displayed. The
display scale and the size of the displayed measurement
can be adjusted. The output to paper can be formatted so
that up to four measurements and their data can be plotted
on one page.

Data Manipulation Module 

In a measurement system the user may have a need to
postprocess the recorded measurement. This module gives
the user the capability of performing arithmetic operations
on data channels. This subsystem has the capability of
performing +, -, ×, ÷, square root, square, log, and negation.
This subsystem gives the user the capability of building
algebraic equations with two operands and one
operator. Operands can be any data channel or constants
or the result of the previous operation. The results can be
displayed using the display module. The last five operations
are shown in a small window. This is done to simplify
the task of computing complex equations through the
chaining of operations. For example, when measuring voltage
and current, the subsystem can be used to compute
power by multiplying the voltage and current readings as
shown in Fig. 6. Manipulations not provided directly by
the software can be applied to the data sets through
user-written programs.
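
The two-operand, one-operator scheme and the chaining of results can be sketched in Python as follows. This is an illustration under stated assumptions, not the MGS implementation: the function names `apply_binary` and `apply_unary`, the operator keys, and the sample data are hypothetical, and base-10 is assumed for the log operation because the text does not state the base.

```python
import math
import operator
from typing import Callable, Dict, List, Union

Channel = List[float]
Operand = Union[Channel, float]

BINARY_OPS: Dict[str, Callable[[float, float], float]] = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

UNARY_OPS: Dict[str, Callable[[float], float]] = {
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
    "log": math.log10,  # assumption: the log base is not given in the text
    "neg": operator.neg,
}


def apply_binary(op: str, left: Operand, right: Operand) -> Channel:
    """Combine two operands point by point; a constant is repeated per sample."""
    lengths = [len(x) for x in (left, right) if isinstance(x, list)]
    if not lengths:
        raise ValueError("at least one operand must be a data channel")
    if len(set(lengths)) != 1:
        raise ValueError("data channels must have the same length")
    n = lengths[0]
    a = left if isinstance(left, list) else [left] * n
    b = right if isinstance(right, list) else [right] * n
    return [BINARY_OPS[op](x, y) for x, y in zip(a, b)]


def apply_unary(op: str, channel: Channel) -> Channel:
    """Apply a one-operand operation to every sample of a channel."""
    return [UNARY_OPS[op](v) for v in channel]


# Power from measured voltage and current (made-up sample values).
voltage = [1.0, 2.0, 3.0]
current = [0.5, 0.5, 2.0]
power = apply_binary("*", voltage, current)

# Chaining: the previous result becomes an operand of the next operation.
doubled = apply_binary("*", power, 2.0)
```

Keeping each step to one operator mirrors the description above: a longer equation is built by feeding each result into the next operation.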

Storage and Retrieval Module 

The storage and retrieval subsystem allows the user to
store not only the measurement data but also the current
measurement setup, annotation, and display setup parameters.
When retrieving data, the user can select subsets of
the data to be retrieved. For instance, annotation can
be stored from one measurement and retrieved into another.
The measurement setup parameters will always be retrieved
along with the measurement data because the data
itself does not have meaning without its setup conditions.
There is a file header at the beginning that contains
information about where the data and setup parameters are
located in the file.

Annotation Module 

The annotation subsystem gives the measurement 
graphics user the capability to put grids, axes, labels, lines, 
markers, and an annotation box on the measurement graph. 
It is not intended to do general-purpose drawing or to be
a graphics editor. Some features are:

■ Axes and grids feature automatic tic labeling in the units
of the measurement. Log axes and grids are also available.

■ Labels are useful for titles and for adding documentation
to the graph. They can be added, changed, or deleted at will.

■ Lines can be used for simple drawing.

■ Markers annotate points on the data line and they can
be automatically labeled with their respective x and y
coordinates. The cursor can be used to step through
points on the data line to position the marker.

■ The annotation box can be used to supply information
about the measurement, such as channel settings, trigger
level, trigger time, and time of day.

Acknowledgments

In addition to the authors, Diane Fisher and Irene Ortiz 
contributed to the design and development of the MGS 
package. Larry Hennessee managed the project.



Analog Channel for a Low-Frequency 
Waveform Recorder 



by Jorge Sanchez 



THE ANALOG CHANNEL of the HP 7090A Measurement
Plotting System conditions and digitizes the
signals connected to the inputs of the instrument.
The analog signals are amplified, filtered, and digitized by
a series of stages as shown in Fig. 1. After the signals are
digitized, the equivalent binary words are processed
through a series of calibration procedures performed by
the microprocessor to provide the full dc accuracy of the
machine. The architecture of the channel is designed with
flexibility of operation as a goal. Thus, the microprocessor
is used to set up the multiple stages for coarse and fine
gains and offsets. This allows the execution of zeroing and
calibration routines and eliminates manual adjustments in
the manufacturing process. (No potentiometers were used
in the design. See box on page 22.) The analog channel has
floating, guarded inputs. Through the use of isolation and
shielding, common mode rejections of >140 dB for dc and
>100 dB for 60 Hz are obtained.

Preamplifier 

The analog channel preamplifier (Fig. 2) uses a set of
low-noise, low-leakage JFETs and low-thermal-EMF relays
to switch the inputs of amplifier A1 to the gain and
attenuation string of resistors. The amplifier switches are
connected in such a way as to set the 14 major ranges for the
HP 7090A. (Other ranges are provided by a postamplifier
as will be explained later.) The ranges are set by the
microprocessor's loading the appropriate words in front-end
registers 1 and 2. Amplifier A2 is used as a buffer to drive
three different circuits:

■ The internal guards that minimize printed circuit board 
leakage in critical areas 

■ The on/off and biasing circuits for the range setting
switches (as set by front-end registers 1 and 2)

■ The input protection feedback loop.

Fig. 1. Block diagram of analog channel.

Fig. 2. Analog channel preamplifier.

To satisfy the performance requirements of the HP 7090A
to handle signals as low as a few microvolts with a
bandwidth spanning from dc to a few kilohertz, the design
uses carefully chosen components such as the precision
low-noise amplifiers A1 and A2 and metal-film resistors
of small values (to avoid white noise). In addition, printed
circuit board layout becomes critical. Hence, extensive use
of guarding and shielding of critical areas is done, including
the use of Teflon™ cups for the input node.

Transistor Q3 is part of the circuitry used in an autozeroing
routine to eliminate channel offsets caused by initial
component errors, temperature drift, and aging.

ESD and Overload Protection 

Front-end inputs are likely to experience ESD (electrostatic
discharge) transients since they can be touched by
the user. Also, in a general-purpose instrument, temporary
dc overloads may be applied. For this reason, protection
circuits are necessary. Very often these circuits tend to
degrade amplifier performance. This situation was avoided
in the HP 7090A by using the circuit shown in Fig. 3.

If there is no way to prevent ESD from penetrating the
machine, the next best thing is to shunt the transient to
ground through a preferential path of impedance lower
than the rest of the circuits. The primary ESD clamp is
actuated by electron tube E1 and the source inductance.
E1 has a very large resistance and low capacitance when
in the off state. Hence, it does not degrade the amplifier's
input impedance. Capacitor C1 turns off E1 after the surge.
Resistor R1 discharges C1. This circuit can only limit V1
to several hundred volts because of the insufficient speed
of E1.

Fig. 3. Input protection circuitry (input terminals: High, Low, Guard).

The secondary protection devices clamp the input to a
voltage less than the maximum allowable voltage for A1.
(This is also used as the dc overload protection.) The other
circuits minimize leakages. Buffer A2 sets rectifiers CR1
and CR2 to zero bias by feedback. Leakage caused by Zener
diodes VR1 and VR2 is provided by A2 and not by the
minus node of A1.

To avoid sourcing current for the bottom plate of C1 from
the common plane, and since there is no way to obtain a
simultaneous turn-on of E1 and E2, C2 is installed between
the low and guard terminals to provide the current.

RV1 is a voltage clamp device used to protect the devices
between the low and guard terminals against overloads
between the input terminals or against transients applied
to the low terminal. The final shunting of the ESD transient
to earth ground is provided by electron tube E2.

In a circuit such as this, care must be taken to shield or
orient the components and connections to prevent
reradiation of noise to other areas. In addition, the breakdown
voltages of interconnections should be much higher than
the breakdown voltages of the devices used. These
protection circuits proved successful during testing by enduring
many thousands of electrostatic discharges up to 25 kV
that were applied to the inputs.

Vernier Gain Stage 

The digitally programmable vernier stage consists of a
12-bit multiplying digital-to-analog converter (DAC) and
an operational amplifier. Its main function, in conjunction
with the preamplifier, is to provide the numerous
calibrated ranges of the machine. The gain in this stage is
represented by G = -D/4096, where D is the decimal
equivalent of the binary word that is applied to the DAC. The
number D is equal to the product of two scaling factors D1
and D2. D1 accounts for the vernier gain. It is derived from
the range entered by the user and from internal routines
in the microprocessor as indicated by the channel's range
calibration equations. D2 is a fixed attenuation factor and
is used as a coarse gain adjustment to account for system
gain error caused by component tolerances.

Fig. 4 flowchart (beginning): Change gain → Open input attenuator and close preamplifier zero switch → Load the offset DAC with count from approximate formula → Do A-to-D conversions, and by iteration on the loaded word in the offset DAC, get the offset as close to zero volts as possible.
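
The vernier gain relation G = -D/4096 can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are hypothetical, and `word_for_gain` is an assumed helper that is not described in the article.

```python
DAC_BITS = 12
DAC_FULL_SCALE = 1 << DAC_BITS  # 4096 counts for a 12-bit DAC


def vernier_gain(d: int) -> float:
    """Gain of the vernier stage, G = -D/4096, for DAC word D."""
    if not 0 <= d < DAC_FULL_SCALE:
        raise ValueError("D must fit in 12 bits (0..4095)")
    return -d / DAC_FULL_SCALE


def word_for_gain(magnitude: float) -> int:
    """Nearest DAC word for a desired gain magnitude |G| (hypothetical helper)."""
    d = round(magnitude * DAC_FULL_SCALE)
    # Clamp to the range a 12-bit word can represent.
    return max(0, min(DAC_FULL_SCALE - 1, d))
```

Because D cannot exceed 4095, the magnitude of the gain is always slightly below unity, so the stage acts as a fine programmable attenuator with a resolution of 1/4096.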

Postamplifier

The postamplifier stage has the following functions: 

■ It amplifies the signal to a voltage level that is suitable 
for the digitizer 

■ It contains a 3-kHz low-pass active filter 

■ It provides an offset voltage that is programmable by the 
microprocessor. 

The programmable offset is accomplished by the use of a
low-cost DAC. This converter is used primarily for
subtracting out the analog channel's subsystem offset each time
the ranges are changed, and for periodically performing a
zero calibration to account for drifts. The offset DAC
performs a coarse offset subtraction in hardware. To
accomplish a fine offset calibration, the residual offset V_os is
first found by the offset calibration routine (see Fig. 4).
This offset is subtracted from the incoming data during the
data correction routine, which is executed after the input
signal is sampled.
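
The coarse-then-fine offset scheme can be sketched in Python against a simulated channel. Everything named here is an assumption made for illustration: the `SimulatedChannel` class, the 8-bit offset DAC range, the one-count-at-a-time search, and the numbers are not taken from the HP 7090A firmware, which the article describes only at the flowchart level.

```python
from typing import Callable, Tuple


def calibrate_offset(
    read_adc: Callable[[], int],
    load_offset_dac: Callable[[int], None],
    initial_word: int,
    max_word: int = 255,   # assumed 8-bit offset DAC
    max_steps: int = 64,
) -> Tuple[int, int]:
    """Iterate on the offset DAC word until the reading is as near zero as
    possible; return the chosen word and the residual offset in ADC counts."""
    word = initial_word
    load_offset_dac(word)
    best = read_adc()
    for _ in range(max_steps):
        if best == 0:
            break
        # Assumes a larger DAC word subtracts more offset.
        candidate = word + (1 if best > 0 else -1)
        if not 0 <= candidate <= max_word:
            break
        load_offset_dac(candidate)
        reading = read_adc()
        if abs(reading) >= abs(best):
            load_offset_dac(word)  # previous word was better; restore it
            break
        word, best = candidate, reading
    return word, best


class SimulatedChannel:
    """Stand-in for the zeroed analog channel (input shorted by the zero switch)."""

    def __init__(self, offset_counts: int, counts_per_dac_step: int) -> None:
        self.offset_counts = offset_counts
        self.counts_per_dac_step = counts_per_dac_step
        self.dac_word = 0

    def load_offset_dac(self, word: int) -> None:
        self.dac_word = word

    def read_adc(self) -> int:
        return self.offset_counts - self.dac_word * self.counts_per_dac_step


def correct_sample(raw_counts: int, residual_offset: int) -> int:
    """Fine correction: subtract the stored residual offset from a sample."""
    return raw_counts - residual_offset


channel = SimulatedChannel(offset_counts=103, counts_per_dac_step=8)
word, v_os = calibrate_offset(channel.read_adc, channel.load_offset_dac, initial_word=10)
```

The hardware step removes most of the offset but is limited by the DAC step size; the residual that remains is stored and subtracted from each sample in software.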

A-to-D Conversion Circuits 

This section consists of one sampling stage with two 
sample-and-hold devices connected in parallel and requir- 
ing an analog multiplexer, buffer and control logic, and a 
12-bit analog-to-digital converter (ADC). Two sample-and- 
hold ICs are used here to be able to perform an A-to-D 
conversion on a sample while simultaneously acquiring 
the next sample (see Fig. 1 on page 22). After the conversion
is completed, the sample-and-hold stages are swapped by 
the sequencing circuits and the cycle is restarted. This 
eliminates the acquisition time wait for a conversion cycle, 
thereby allowing the use of a slower low-cost converter. 
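
The benefit of alternating two sample-and-hold devices can be expressed as simple timing arithmetic. The sketch below is illustrative only: the function names and the idea of expressing times in microseconds are assumptions, and no timing figures are taken from the HP 7090A.

```python
def sample_period_single(t_acquire_us: float, t_convert_us: float) -> float:
    """One sample-and-hold: acquisition and conversion happen one after the other."""
    return t_acquire_us + t_convert_us


def sample_period_alternating(t_acquire_us: float, t_convert_us: float) -> float:
    """Two sample-and-holds swapped each cycle: one acquires while the other
    is being converted, so the slower of the two operations sets the period."""
    return max(t_acquire_us, t_convert_us)
```

With made-up values of 10 µs to acquire and 25 µs to convert, the single-device period would be 35 µs and the alternating period 25 µs, which is why a slower converter can meet the same sample rate.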

Studies have shown that the eye can distinguish very
small fluctuations in a ramp waveform when it is plotted.
For this reason, a 12-bit-resolution ADC had to be used,
since the HP 7090A can plot the digitized waveform.

Common Mode Rejection Ratio (CMRR) 

The CMRR specifications of the HP 7090A demand a
high degree of isolation between the analog channel and
ground. This requires resistances on the order of gigohms
and a maximum capacitance to ground of about 25
picofarads. There are two main areas that provide the
isolation: the optical interface and the channel power supply.



Fig. 4 flowchart (continued): Take results of last A-to-D conversion and compute the residual channel offset → Store V_os in RAM → Open zero switch and restore the appropriate attenuator for the range.

Fig. 4. Flowchart of offset calibration routine.



Fig. 5. Simplified error model for HP 7090A front end.

Fig. 6. To calibrate gain, the response represented by line 2 (y = m2x) is mapped into the ideal response indicated by line 1 (y = m1x).

The optical isolators provide all of the digital
communications with the system processor in serial form. This is
done with a small degradation in capacitance and
resistance to ground. In addition, internal shields in the
optocouplers provide common mode transient rejection of at
least 1000 V/µs.

The most critical component in the channel power supply
is the isolation transformer. To obtain a low isolation
capacitance, three box shields are used. It can be
demonstrated that three box shields will eliminate the most
common ground loops associated with a floating front end.¹
Box shields and special manufacturing techniques minimize
and cancel currents induced by the transformer into the
preamplifier circuits. With this and careful pin assignments,
low coupling capacitances in the hundreds of femtofarads
are obtained.

The analog channel printed circuit board is placed in a 
sheet-metal shield to decrease coupling capacitances to 
ground and to minimize external source interference into 
the sensitive amplifiers. 

Modern analog front ends often include digital and
analog signals in the same set of circuits. This can become
troublesome when there is a need to handle microvolt-level
signals at high accuracy and wide bandwidths. Detailed
attention to printed circuit board layout makes it possible
to obtain high-quality signal conditioning. For this
purpose, isolation of internal grounds and of analog and digital
signals was done. Ground planes are also used to minimize
intersignal capacitances. In addition, well-known
techniques² are used throughout the board for isolating power
supply output impedances and ground returns from the
different stages.

Computer Calibration 

To preserve accuracy under different temperature conditions
and to compensate for the aging of components, the
HP 7090A's microprocessor executes a series of calibration
routines. These same routines allow the use of automated
gain calibration at the factory. The calibration factors thus
obtained are stored in a nonvolatile memory in the HP
7090A.

Every stage in the front end adds errors to the signal. 
The procedure followed is to lump all errors, refer them 
to the inputs, and separate them into gain and offset errors. 



Fig. 5 shows a simplified example of an error model. In
this case $G$ = ideal gain, $V_i$ = input signal, $\Delta G$ = gain error,
$V_o$ = signal at the ADC, $V_{os}$ = offset error, and $[V_o]$ =
quantized value of $V_o$.

To calibrate the sampled signal, we first sample the system
offset by closing S2 and opening S1. This is done in
the HP 7090A during the offset calibration routine outlined
in Fig. 4. This yields:

$$V_{o1} = (G + \Delta G)V_{os}$$

Then, we acquire the input signal by opening S2 and
closing S1, which gives:

$$V_{o2} = GV_i + \Delta G V_i + GV_{os} + \Delta G V_{os}$$

After offset compensation we get:

$$V_{o3} = V_{o2} - V_{o1} = GV_i + \Delta G V_i$$

To do a gain calibration, we map response line 2 in Fig.
6 into line 1 by the procedure explained in the box on page
22. This yields the gain calibration factor $G/(G + \Delta G)$. This
factor is obtained for each one of the 14 major ranges of
the machine. As mentioned before, these factors are stored
in the HP 7090A's internal nonvolatile memory.

Accuracy in other ranges that use the vernier is guaranteed
by the circuit design.

The gain calibration requires a final multiplication:

$$V_{o4} = V_{o3}\,[G/(G + \Delta G)] = [V_i(G + \Delta G)]\,[G/(G + \Delta G)] = GV_i$$

This last quantity is indeed the amplified input voltage,
which is the desired quantity.
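
The offset and gain correction sequence can be verified numerically with a short Python sketch. It follows the simplified error model only: quantization is ignored, the gain calibration factor is assumed to be known already, and the function names and numeric values are hypothetical.

```python
import math


def adc_signal(v_in: float, gain: float, gain_error: float, v_os: float) -> float:
    """Simplified error model: signal at the ADC is (G + dG)(V_in + V_os)."""
    return (gain + gain_error) * (v_in + v_os)


def corrected_reading(v_i: float, gain: float, gain_error: float, v_os: float) -> float:
    # S2 closed, S1 open: only the system offset reaches the ADC.
    v_o1 = adc_signal(0.0, gain, gain_error, v_os)
    # S2 open, S1 closed: input signal plus offset.
    v_o2 = adc_signal(v_i, gain, gain_error, v_os)
    # Offset compensation.
    v_o3 = v_o2 - v_o1
    # Gain calibration factor G/(G + dG), assumed stored from a prior calibration.
    cal_factor = gain / (gain + gain_error)
    return v_o3 * cal_factor


# Made-up values: ideal gain 10, gain error 0.2, offset 3 mV, input 0.25 V.
result = corrected_reading(v_i=0.25, gain=10.0, gain_error=0.2, v_os=0.003)
```

For these values the corrected result equals the ideal amplified input, 10 × 0.25 V = 2.5 V, to within floating-point rounding, independent of the offset and gain error chosen.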

Other more complex models, similar to the one above,
are used to account for other operations of the machine
such as user's entered offset, factory calibration routines,
and combinations of interacting errors. The exact equations
used for the corrections in firmware are also in a quantized
form.

Usability Testing: A Valuable Tool for PC
Design

by Daniel B. Harrington 

Evaluating the experiences of users unfamiliar with a new 
computer product can provide valuable guidance to the 
designer and the documentation preparer. 



A KEY ELEMENT IN THE DESIGN of a personal computer
is how easy it is for a new owner to set it up,
get it running, and do basic tasks such as printing
output, loading software, entering data, and handling files.
To evaluate these qualities, HP's Portable Computer Division
has conducted three usability tests, two on the Integral
PC (one before, one after introduction) and one on The
Portable (after introduction). A single test program uses
ten reviewers, one per day, each performing for pay the
same set of tasks on the selected computer model. The
tasks are performed in the testing room at the division.

The reviewers are selected to meet the profile of the
expected buyer of the computer. Each reviewer's experience
is videotaped, and an observer in the test room
constantly monitors the reviewer's progress (see Fig. 1). When
a reviewer becomes frustrated enough to call the dealer for
help, the observer acts as the dealer and offers the help
requested. Product engineers and management are invited
to observe the test sessions. The results of the test, including
suggestions for product improvement, are widely distributed.
Finally, a reviewer debriefing meeting is held where
the reviewers and HP engineers can discuss the usability
of the product.

Why Have Usability Testing? 

Hewlett-Packard is committed to quality and customer
satisfaction. To know if we're satisfying our customers, we
must measure our performance. Usability testing provides
one means of measuring product quality and customer
satisfaction. This method has several advantages:

■ Product engineers can observe users (the reviewers) 
using their products, both during product development 
and after market introduction. Tests conducted during 
product development allow changes in the design of the 
product to satisfy the observed needs of users.

■ It's a controlled measurement allowing statistical evalu- 
ation and comparisons of user satisfaction before and 
after product changes are made. 

■ Product engineers can meet the group of reviewers at a 
debriefing meeting. At this meeting, engineers can hear 
what the reviewers liked and did not like about the prod- 
uct, and the product changes they wish HP would make. 
This meeting also allows dialog between engineers and 
reviewers.

■ It's an especially effective test of documentation, a key 
part of this type of product. 



Many of our competitors emphasize the human interface.
They understand that buying decisions are affected both
by the reported length of time it takes new users to get
familiar with a computer and the difficulties users have
encountered in using it. Corporate buying decisions are
especially influenced by the computer productivity
expected from a particular brand or model.

Magazine evaluations also focus on user-friendliness. 
Perhaps you've read, as we have, magazine reviews of new 
computers, in which the writers take great pleasure in de- 
scribing their frustrations in trying to use the computers. 
Such negative reviews must hurt sales, just as positive
reviews must help sales. 

Customers do not like to be frustrated by incomprehensible
error messages, manual jargon, confusing instructions,
peripherals that won't work when connected, and
all the other problems that a first-time user of a personal
computer too often encounters. Usability testing offers an
effective way to measure and reduce such problems.




Fig. 1. A reviewer studies the instructions for the computer
being tested. Note the observer and monitor in the background.

How Is Usability Testing Done?

We learn from product management the profile of the
expected buyer of the computer we're about to test. We
then seek people in the local community who fit that profile
and who are not HP employees. We find most of them are
excited about spending a day with us playing with a new
computer. As a token of our appreciation, we pay them
for their help.

We encourage HP people to observe the usability test. 
We want those responsible for the product to watch and 
listen to these reviewers as they work. While it can be a 
humbling experience to see how the results of our efforts
somehow fail to work in the reviewer's hands as we in- 
tended them to, such experiences are vital to developing 
a product that satisfies users. 

Each reviewer spends a day using the computer in a
simulated work environment. We equip the testing room
with a table set up like a typical office desk, complete with
plant and in-basket. At best, the test situation in which the
reviewers find themselves is still foreign, but we try to
create an atmosphere that is at least partially familiar. We
feel the closer the testing environment is to a typical user's
workplace, the more valid our results will be.

An opening questionnaire gives us the reviewer's computer
experience and educational background. This information
helps us qualify each reviewer's experiences during
the test session. This questionnaire also confirms that the
reviewer meets the profile of the expected buyer.

Before users operate a computer for the first time, most
have studied the market and already know something about
the particular computer they have chosen. Reading
brochures and reviews, having discussions with dealers
and other users, and watching others use the computer
allow a user to set up and run a new computer more
efficiently than one who has never seen nor heard of the
product before opening the box. We can't completely duplicate
this knowledge, especially for a product still under
development, but we do give each reviewer a description of
the product before the test session begins. For a released
product, we mail a brochure and data sheet to each reviewer
a week before the test starts.

The reviewers start with the computer in its shipping
carton. We give each of them the same set of tasks or ob- 
jectives, and ask them to perform them in any order they 
desire. 

A video and audio recording of each session is made.
These recordings serve several purposes:

■ They support the notes the observer makes at each session.

■ They are available for study after the test is over.

■ They provide the raw material for the summary tape
shown at the reviewer debriefing meeting.

We urge reviewers to comment freely. The audio portion
of the tape is often the most important. We want reviewers
to tell us what they're doing, how they feel, what they like
and don't like about the product; in short, we want almost
a stream-of-consciousness narrative.

An observer is always in the room with the reviewer. 
The observer uses notes taken during the usability test to 
write the test report. When the observer needs more opin- 
ions and information from the reviewer, the reviewer is 
asked appropriate questions during the test. 



When we started these tests, we were concerned about
the observer sharing the test room with the reviewer. The
standard testing arrangement used by IBM¹ consists of two
rooms separated by a one-way mirror. The reviewer is alone
in one room, which is identical to a typical office. The
observers, video cameras, and other equipment are in the
other room. We started with and still use only one room,
but we feared the observer's presence would inhibit the
reviewer's actions and comments, making the results less
valid. Therefore, we specifically asked reviewers who
helped us with our first test if the observer's presence hurt
the effectiveness of the test. They told us the nearness of
the observer helped, rather than hurt, the process. They felt
they were talking to a human rather than a machine, which
made it easier to comment freely. They also appreciated
the observer's encouragement and requests for comments.

We also emphasize that the product is on trial, that the
reviewer cannot fail. It's important that reviewers feel at
ease so that their experiences are as close as possible to
those real users would experience. However, some reviewers
still feel under some pressure to perform, and try to
finish the tasks as fast as they can to do a good job. An
observer can help reduce this pressure by creating an
atmosphere of you-can't-fail informality. This is another
advantage in having the observer share the test room with
the reviewer.

The reviewers have only two sources of help: 

■ The manuals, disc-based tutors, on-screen help messages,
and other material delivered with the product.

■ Their dealer (the observer). 

Reviewers that reach a level of frustration that would
produce a call to their dealer if they were using their own
computer in their home or office can pick up the
unconnected phone on their desk. This action tells the observer
that a dealer call is being made. The observer then acts as
the dealer and gives whatever help is needed. The number
of such calls and the reasons for them can tell us a lot
about what product features are hard to understand or not
working well.

Fig. 2. HP's Integral Personal Computer is a powerful
multitasking computer system in a 25-lb transportable package.
Designed for technical professionals, it features a built-in
printer, display, disc drive, and HP-IB interface and the HP-UX
operating system, HP's version of AT&T Bell Laboratories'
UNIX™ operating system.

JANUARY 1986 HEWLETT-PACKARD JOURNAL 37

© Copr. 1949-1998 Hewlett-Packard Co.

A closing questionnaire asks for opinions about the prod- 
uct. In general, this questionnaire asks two types of ques- 
tions. One type asks reviewers to rank their level of agree- 
ment or disagreement with a number of positive statements 
about various features of the product, such as:

The owner's manual is easy to understand. 

The error messages are easy to understand. 

I like the display. 

Each reviewer is asked to rank each statement from 1
(strongly agree) to 5 (strongly disagree). The other general
type of question asks reviewers to comment on various parts
of the product, such as manuals, keyboard, display, help
messages, etc. Often, a product feature like a manual is the
subject of both a ranking question and an essay question.
Another common question asks reviewers to identify the
most difficult or the three most difficult tasks. That question
is followed with a ranking question something like
this: "Considering the difficulty of the task you identified
as the most difficult, the instructions for that task are as
clear as they can be."

The video recorder is stopped while the closing questionnaire
is completed. Then it is turned on again to record
the closing interview. The observer chooses some closing
topics to discuss further, generally about product areas
reviewers felt needed improvement. These interviews often
produce some of the best and most useful video footage.

About two weeks after the last test session, the reviewers
and the product engineers meet together. This is a very
useful meeting. It allows the product engineers (hardware,
software, electronic, system, packaging, manual, quality,
production, etc.), management, and anyone else who is
interested to hear reviewers' opinions directly. By asking
questions, the audience can draw out additional reviewer
opinions and suggestions.

The final report is widely distributed. This report describes 
the test and gives the reviewers' opinions and suggestions.



anism and the postintroduction usability test of the Integral
PC told us that they did an excellent job. The reviewers
who judged this computer during this second test felt the 
computer did give an impression of quality and ruggedness. 

The Integral PC's on-screen tutor, a new type of instruction
product for our division, incorporated usability testing
as a key item in its development schedule. The strong positive
acceptance of the final tutor would not have been
possible without the user feedback given by two informal
usability tests and a final, formal usability test conducted
during product development.

The Integral PC Setup Guide (Fig. 3) is another new type
of instruction product for our division. This guide uses a
series of pictures with very few words to tell a first-time
user how to open the computer's case, connect the keyboard
and optional mouse, and start the on-screen tutor. Other
sections of this setup guide tell the user how to install the
printhead cartridge for the built-in ThinkJet printer, how
to load fanfold paper into the printer, and how to prepare
the Integral PC for transporting.

Usability testing was incorporated into the development
schedule for this setup guide. These tests indicated the
need for major changes in the initial guide. The postintroduction
usability test proved the final setup guide was
very useful, and suggested some further improvements.

The preintroduction usability test of the Integral PC
suggested improvements in the packaging. The initial shipping
carton design we tested included a thin, flat parts box
inside the shipping carton. Either of the two large faces of
this parts box could be opened easily by users, but the box
would reveal all of its contents only when one of these
faces was opened. If the other face was opened, many of
the smaller parts were well hidden. When the reviewers
pulled this parts box out of the shipping carton, chance
would dictate which large face was up when the box was
laid on a table. If the wrong side faced up, the wrong side
was opened, and parts were lost.

The packaging engineer observed some of the reviewers 
opening the wrong side, and had a cure specified before 



How Has Usability Testing Helped? 

During the preintroduction test of the Integral PC, 11 reviewers
felt the initial mechanical design did not give an
impression of quality and ruggedness. A description of this
computer will help to explain their complaint. The Integral
PC (Fig. 2) is a transportable computer. The bottom of the
keyboard is the front face of the closed-up computer, and
the carrying handle is attached to the top, which opens up
and folds back to release the keyboard and reveal the built-in
flat-panel display, 3½-inch disc drive, and ThinkJet
printer. The main reviewer complaint about the apparent
lack of ruggedness centered on the mechanism that controls
the opening and closing action of the top cover. This mechanism
had been tested by engineering and had satisfied
their tough strength specifications. However, the reviewers
felt the looseness of the mechanism suggested weakness
and sloppy design.

The mechanical engineers accepted the reviewers' judgment
that the top cover mechanism should not only be
rugged, but should also appear rugged. They made design 
changes that largely eliminated the looseness of this mech- 




Fig. 3. Integral PC Setup Guide, a 10-page guide whose
development depended on usability testing.
the sequence of usability tests was over. He specified that
the words "Open this side" appear in large letters on the 
right side, and the words "Open other side" appear in
large letters on the wrong side. This improved parts box
was tested during the postintroduction usability test. During
this test, the reviewers proved that people often don't
see what they look at and don't read what they see. In spite
of the words "Open other side" printed in large letters on
the wrong side of the parts box, several reviewers opened
the wrong side and did not see or remove the
smaller parts, including the ink cartridge for the ThinkJet
printer. One reviewer suggested that we design our parts
box to open only one way. Again the packaging engineer
responded quickly, and the Integral PC parts box now opens
only one way. This example shows the importance of testing
the cures inspired by previous tests.

During the preintroduction test of the Integral PC, reviewers
felt the disc drive busy light was too dim. Engineering
responded, and the production computer now has a satisfyingly
bright light to indicate disc drive activity.

Some help screens provided by The Portable (Fig. 4),
and displayed by pressing a function key, do not state
clearly how to get out of the help screen. One help screen
set, consisting of a number of screens, does not tell the
user how to exit until the fourth screen. The software group
in engineering listened to the reviewers' comments about
this. The Portable PLUS, developed after The Portable, also
uses help screens, but the first screen of every help screen
set clearly tells the user how to exit.

The Portable includes a disc-based diagnostic program.
This program was loaded into the memory of the first
shipped units of The Portable, and its label was shown in
the PAM's (Personal Application Manager's) main screen
at the far left. When The Portable's display was first turned
on, the selection arrow pointed to the diagnostic program's
label. During the usability test, several reviewers pressed
Start on the keyboard to see what would happen. This would
start the diagnostic program, causing much confusion.
Again engineering listened, and they specified that this
disc-based diagnostic program no longer be loaded into
The Portable before shipment, although the disc containing
this program continues to be included with the product.

The Portable was the first computer from this division
to use three-ring binders for its manuals. We elected to put
five separate manuals into one binder separated by tabs,
since these five manuals fit comfortably in one binder, and
doing so reduced product cost. A second binder was used
to contain only one manual, the Lotus® 1-2-3™ User's Manual.
Even though we stated clearly (we thought) on the
second page of the first manual that five separate manuals
were in the binder, and gave descriptions of each, many
reviewers were confused. They thought instead that the
binder contained several sections of one manual. For example,
they would look in the index of the last manual, the
MS™-DOS Operating System User's Guide, for page references
to the other manuals. Since each of the five manuals
started with page 1-1, reviewers were understandably frustrated.
As a result, future loose-leaf binders will each contain
only one loose-leaf manual, or will provide clear ways
for users to realize that the binder contains more than one.

The Portable reviewers made many other suggestions for
manual improvement. Three of the more important suggestions
that have been implemented are:

■ Start each chapter with a table of contents.

■ Every reference to a function key should be followed
with the keycap label, like Start (f1).

■ Every keystroke sequence that requires pressing Return
to generate the desired action should include Return as
the last keystroke.

The postintroduction test of the Integral PC gave us our
first chance to test the general manual improvements. Each
reviewer opened a new box fresh from the production line
to ensure that the contents were arranged and packaged
just as actual users would see them when opening their
newly purchased computer. One complaint these reviewers
had was the difficulty and frustration of tearing the plastic 
shrink wrapping off the manual binders. They were espe- 
cially vocal about the very rugged clear plastic we used for 
the plastic bag containing the setup guide and tutor disc. 
These reviewers suggested we add an easy-open tab to the 
shrink wrapping and use a zip-lock plastic bag for the setup 
guide and tutor disc. These suggestions are being consid- 
ered. 

Our documentation department maintains a revision file
on all current manuals. When a manual reprinting becomes
due, the responsible writer checks the appropriate file and
incorporates the corrections and changes that have collected
since the last printing. All reviewer suggestions for
manual improvements made during the postintroduction
test of the Integral PC are inserted in the appropriate
manual revision file, provided the suggestions make sense
(most of them do). In this way, the next printing of each
manual will profit from the feedback given to us by these
reviewers.

What Improvements Have We Made to the Testing Process?
Each time we conduct a usability test, we learn how we
can improve it further. Some of the improvements we've
made to the testing process since we began are:

■ The task list we used for the early tests was quite detailed.




Fig. 4. The Portable is a 9-lb personal computer with built-in
software for file management, spreadsheets, graphics, word
processing, and data communications.
For instance, the test of The Portable asked each reviewer
to perform 38 narrowly defined tasks that we expected
reviewers to perform in a particular order. For example,
the first task asked them to turn on the computer. We now
ask reviewers to complete a smaller series of broader objectives,
and urge them to complete these objectives in any
logical order. (An example of an illogical order would be
to start a program from the electronic disc before first copying
that program from a flexible disc.) The first task listed
on our latest 14-item task list asks reviewers to install extra
memory, but since we urged reviewers to perform tasks 
in any order, one reviewer performed this task near the 
end of his session. 

■ In the beginning, we used only one microphone, a lapel
mike for the reviewer. Therefore, only half of the several 
conversations per session between the observer and the 
reviewer were recorded. Now the observer also has a mike, 
and we use a mixer to feed both audio signals to the video 
recorder. 

■ The videotape of the first test consisted exclusively of
medium-to-long-distance shots of the reviewer working at
the desk. Much of the recorded action consisted of reviewers
turning manual pages hoping to find answers to their
problems. Now we only use long-distance shots to show 
the action during unpacking, connecting peripherals, load- 
ing printer paper, etc. As soon as a reviewer starts working 
at the keyboard, we record a close-up shot of the display. 
The main advantage is that the observer can tell what the 
reviewer is doing by watching the computer's display in 
the TV monitor. 

■ We now record the closing interview, rather than simply
take notes as we did at first. These interviews produce some of our
best recordings, since they often contain excellent, useful
comments on our products.

■ We have always held debriefing meetings, in which the 
reviewers have a chance to give their opinions directly to 
the people in the division responsible for the product. We 
now have added another feature to these meetings — a spe- 
cial videotape lasting one hour or less and containing the 
most significant results of the approximately 50 hours of 
videotape recorded during the 10 sessions. These have
proved quite informative, and clearly show the occasional 
sad and funny experiences of new users when they're con- 
fronted with the result of our work. 

■ During preintroduction tests, serious and obvious product
and manual errors are now corrected immediately during
the test program where possible. This allows us to measure
these cures during the later sessions of the same test,
permitting further change if needed before product release.